Hippocampus-associated proteins, DNA sequences coding therefor and uses thereof

ABSTRACT

This invention provides novel hippocampus-associated proteins and DNA sequences coding therefor. In an investigation of hippocampus-associated proteins by differential screening of a rat hippocampus cDNA library, a cDNA species encoding a novel protein designated Hct-1 was isolated and shown to be related to cytochromes P450. The use of hybridization probes based on the rat Hct-1 sequence has led to the identification of homologues in other mammalian species.

[0001] This invention relates to novel hippocampus-associated proteins, to DNA sequences coding therefor, to uses thereof and to antibodies to said proteins. The novel hippocampus-associated proteins are believed to be of the cytochrome P450 class.

BACKGROUND TO THE INVENTION

[0002] The identification of hippocampus-associated proteins and the isolation of cDNA molecules coding therefor is important in the field of neurophysiology. Thus, for example, such proteins are believed to be associated with memory functions, and abnormalities in these proteins, including abnormal levels of expression and the formation of modified or mutated protein, are considered to be associated with pathological conditions involving memory impairment. The isolation of novel hippocampus-associated proteins and the associated DNA sequences coding therefor is consequently of considerable importance.

[0003] The present invention arose out of our investigation of hippocampus-associated proteins by differential screening of a rat hippocampus cDNA library. A cDNA species encoding a novel protein which we have designated Hct-1 was isolated and shown to be related to cytochromes of the P450 class.

[0004] The use of hybridization probes based on the rat Hct-1 sequence has led to the identification of homologues in other mammalian species, specifically mouse and human.

[0005] Cytochromes P450 are a diverse group of heme-containing mono-oxygenases (termed CYP's; see Nelson et al., DNA Cell Biol. (1993) 12, 1-51) that catalyse a variety of oxidative conversions, notably of steroids but also of fatty acids and xenobiotics. While CYP's are most abundantly expressed in the testis, ovary, placenta, adrenal and liver, it is becoming clear that the brain is a further site of CYP expression. Several CYP activities or mRNA's have been reported in the nervous system but these are predominantly of types metabolizing fatty acids and xenobiotics (subclasses CYP2C, 2D, 2E and 4). However, primary rat brain-derived glial cells have the capacity to synthesize pregnenolone and progesterone in vitro. Mellon and Deschepper, Brain Res. (1993), 629, 283-292, provided molecular evidence for the presence, in brain, of key steroidogenic enzymes CYP11A1 (scc) and CYP11B1 (11β) but failed to detect CYP17 (c17) or CYP11B2 (AS). Although CYP21A1 (c21) activity is reported to be present in brain, authentic CYP21A1 transcripts were not detected in this tissue.

[0006] Interest in steroid metabolism in brain has been fuelled by the finding that adrenal- and brain-derived steroids (neurosteroids) can modulate cognitive function and synaptic plasticity. For instance, pregnenolone and steroids derived from it are reported to have memory enhancing effects in mice. However, the full spectrum of steroid metabolizing CYP's in brain and the biological roles of their metabolites in vivo have not been established.

[0007] To investigate such regulation of brain function our studies have focused on the hippocampus, a brain region important in learning and memory. Patients with lesions that include the hippocampus display pronounced deficits in the acquisition of new explicit memories while material encoded long prior to lesion can still be accessed normally. In rat, neurotoxic lesions to the hippocampus lead to a pronounced inability to learn a spatial navigation task, such as the water maze. The role of the hippocampus in learning has been further emphasized by the finding that hippocampal synapses, notably those in region CA1, display a particularly robust form of activity-dependent plasticity known as long term potentiation (LTP). This phenomenon satisfies some of the requirements for a molecular mechanism underlying memory processes: persistence, synapse-specificity and associativity. LTP is thought to be initiated by calcium influx through the NMDA (N-methyl D-aspartate) subclass of receptor activated by the excitatory neurotransmitter, L-glutamate, and occlusion of NMDA receptors in vivo with the competitive antagonist AP5 blocks both LTP and the acquisition of the spatial navigation task.

[0008] The induction of LTP is attenuated by simultaneous release of gamma-amino butyric acid (GABA) from inhibitory interneurons: activation of GABA_(A) receptors antagonizes L-glutamate induced depolarization of the postsynaptic neuron and interplay between the GABA and L-glutamate receptor pathways is thought to modulate the establishment of LTP. Interplay between these two circuits is emphasised by the finding that some anaesthetics (e.g. ketamine) act as antagonists of the NMDA receptor while others, such as the steroid anaesthetic alfaxolone, are thought to be agonists of the GABA_(A) receptor. It is of particular note that some naturally occurring steroids, such as pregnenolone sulfate, act as agonists of the GABA_(A) receptor, while pregnenolone sulfate is also reported to increase NMDA currents. Although neurosteroids principally appear to exert their effects via the GABA_(A) and NMDA receptors, there have been indications that neurosteroids may also interact with sigma and progesterone receptors.

[0009] Despite considerable interest in the action of neuro-active steroids, and possible roles in modulating synaptic plasticity and brain function, little is known of pathways of steroid metabolism in the central nervous system. As part of a study into the molecular biology of the hippocampal formation, and the mechanisms underlying synaptic plasticity, we have sought molecular clones corresponding to mRNA's expressed selectively in the formation. One such cDNA, Hct-1 (for hippocampal transcript), was isolated from a cDNA library prepared from adult rat hippocampus. Sequence analysis has revealed that Hct-1 is a novel cytochrome P450 most closely related to cholesterol- and steroid-metabolizing CYP's but, unlike other CYP's, is predominantly expressed in brain. The present invention provides molecular characterization of Hct-1 coding sequences from rat, mouse and humans, their expression patterns, and discusses the possible role of Hct-1 in steroid metabolism in the central nervous system.

[0010] DNA sequences encoding hitherto unknown cytochrome P450 proteins have now been identified and form one aspect of the present invention.

SUMMARY OF THE INVENTION

[0011] According to one aspect of the present invention there are thus provided DNA molecules selected from the following:

[0012] (a) DNA molecules containing the coding sequence set forth in SEQ Id No: 1 beginning at nucleotide 22 and ending at nucleotide 1541,

[0013] (b) DNA molecules containing the coding sequence set forth in SEQ Id No: 2 beginning at nucleotide 1 and ending at nucleotide 1242,

[0014] (c) DNA molecules capable of hybridizing with the DNA molecule defined in (a) or (b) under standard hybridization conditions defined as 2×SSC at 65° C.

[0015] (d) cytochrome P450-encoding DNA molecules capable of hybridizing with the DNA molecule defined in (a), (b) or (c) under reduced stringency hybridization conditions defined as 6×SSC at 55° C.

[0016] Such DNA sequences can represent coding sequences of Hct-1 proteins. The sequences (a) and (b) above represent the mouse and rat Hct-1 gene sequences. Homologous sequences from other vertebrate species, especially mammalian species (including man), fall within the class of DNA molecules represented by (c) or (d).

[0017] Thus the present invention further provides a DNA molecule consisting of sequences of the human Hct-1 gene.

[0018] These DNA sequences may be selected from the following:

[0019] (e) DNA molecules comprising one or more sequences selected from

[0020] (i) the sequence designated “intron 2” in SEQ Id No 3,

[0021] (ii) the sequence designated “exon 3” in SEQ Id No 3,

[0022] (iii) the sequence designated “intron 3” in SEQ Id No 3,

[0023] (iv) the sequence designated “exon 4” in SEQ Id No 3, and

[0024] (v) the sequence designated “intron 5” in SEQ Id No 3; and

[0025] (f) DNA molecules capable of hybridizing with the DNA molecules defined in (e) under standard hybridization conditions defined as 2×SSC at 65° C.

[0026] (g) cytochrome P450-encoding DNA molecules capable of hybridizing with the DNA molecule defined in (e) or (f) under reduced stringency hybridization conditions defined as 6×SSC at 55° C.

[0027] (h) DNA molecules comprising contiguous pairs of sequences selected from

[0028] (i) the sequence designated “intron 2” in SEQ Id No 3,

[0029] (ii) the sequence designated “exon 3” in SEQ Id No 3,

[0030] (iii) the sequence designated “intron 3” in SEQ Id No 3,

[0031] (iv) the sequence designated “exon 4” in SEQ Id No 3, and

[0032] (v) the sequence designated “intron 5” in SEQ Id No 3; and

[0033] (i) DNA molecules capable of hybridizing with the DNA molecules defined in (h) under standard hybridization conditions defined as 2×SSC at 65° C.

[0034] (j) cytochrome P450-encoding DNA molecules capable of hybridizing with the DNA molecule defined in (h) or (i) under reduced stringency hybridization conditions defined as 6×SSC at 55° C.

[0035] (k) DNA molecules comprising a contiguous coding sequence consisting of the sequences “exon 3” and “exon 4” in SEQ Id No 3, and

[0036] (l) DNA molecules capable of hybridizing with the DNA molecules defined in (k) under standard hybridization conditions defined as 2×SSC at 65° C.

[0037] (m) cytochrome P450-encoding DNA molecules capable of hybridizing with the DNA molecule defined in (k) or (l) under reduced stringency hybridization conditions defined as 6×SSC at 55° C.

[0038] It will be appreciated that the DNA sequences that include introns (such as the sequences covered by definitions (e) to (j) above) may consist of or be derived from genomic DNA. Those sequences that exclude introns may also be genomic in origin, but typically would consist of or be derived from cDNA. Such sequences could be obtained by probing an appropriate library (cDNA or genomic) using hybridisation probes based upon the sequences provided according to the invention, or they could be prepared by chemical synthesis or by ligation of sub-sequences.

[0039] The invention further provides DNA molecules encoding an Hct-1 gene-associated sequence coded for by a DNA molecule as defined above, but which differ in sequence from said sequences by virtue of one or more amino acids of said Hct-1 gene-associated sequences being encoded by degenerate codons.
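By way of illustration only, the following Python sketch shows what is meant by sequences that differ only through degenerate codons: two DNA molecules with different nucleotide sequences can translate to the same polypeptide. The codon table is the standard genetic code; the two short sequences and the function name `translate` are hypothetical and are not taken from SEQ Id Nos: 1, 2 or 3.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position
# (third base varying fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical sequences differing only in synonymous (degenerate) codons:
# GCT and GCC both encode Ala; CTG and TTA both encode Leu.
variant_a = "ATGGCTCTG"
variant_b = "ATGGCCTTA"

same_protein = translate(variant_a) == translate(variant_b)
```

Here `variant_a` and `variant_b` differ at three nucleotide positions yet encode the same tripeptide, so `same_protein` is true.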

[0040] The present invention further provides DNA molecules useful as hybridization probes and consisting of a contiguous sequence of at least 18 nucleotides from the DNA sequences set forth in SEQ Id Nos: 1, 2 and 3.

[0041] Such molecules preferably contain at least 24 and more preferably at least 30 nucleotides taken from said sequences.
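As a minimal sketch of what "a contiguous sequence of at least 18 nucleotides" means in practice, the following Python example enumerates every contiguous window of a chosen length from a sequence. The function name `candidate_probes` and the 24-nucleotide example sequence are hypothetical placeholders; the example sequence is not taken from SEQ Id Nos: 1, 2 or 3, and no probe-design criteria (melting temperature, specificity) are applied.

```python
def candidate_probes(sequence, length=18):
    """Return every contiguous subsequence of the given length."""
    if length < 1:
        raise ValueError("length must be positive")
    seq = sequence.upper()
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


# Hypothetical 24-nucleotide placeholder sequence.
example = "ATGGCTAGCTAGGATCCGATCGTA"

probes_18 = candidate_probes(example, 18)  # minimum length stated in the text
probes_24 = candidate_probes(example, 24)  # preferred length
probes_30 = candidate_probes(example, 30)  # longer than the example: no windows
```

A sequence of N nucleotides yields N - L + 1 windows of length L, and none when L exceeds N.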

[0042] The aforementioned DNA molecules are useful as hybridization probes for isolating members of gene families and homologous DNA sequences from different species. Thus, for example, a DNA sequence isolated from one rodent species, for example rat, has been used for isolating homologous sequences from another rodent species, for example mouse, and from other mammalian species, e.g. primate species such as humans.

[0043] Such sequences may be further used for isolating homologous sequences from other mammalian species, for example domestic animals such as cows, horses, sheep and pigs, and primates such as chimpanzees, baboons and gibbons.

[0044] DNA sequences according to the invention may be used in diagnosis of neuropsychiatric disorders, endocrine disorders, immunological disorders, diseases of cognitive function or neurodegenerative diseases, for example by assessing the presence of depleted levels of mRNA and/or the presence of mutant or modified DNA molecules. Such sequences include hybridisation probes and PCR primers. The latter generally would be short oligonucleotides (e.g. 10 to 25 nucleotides in length) and would be capable of hybridising with a DNA molecule as defined above. The invention includes the use of such primers in the detection of genomic DNA or cDNA from a biological sample for the purpose of diagnosis of neuropsychiatric disorders, endocrine disorders, immunological disorders, diseases of cognitive function or neurodegenerative diseases.

[0045] The present invention further provides hippocampus-associated proteins as such, encoded by the DNA molecules of the invention.

[0046] In particular, there is provided

[0047] (i) the protein designated rat Hct-1 comprising the amino acid sequence set forth in SEQ Id No: 1 or a protein having substantial homology thereto,

[0048] (ii) the protein designated mouse Hct-1 comprising the amino acid sequence set forth in SEQ Id No: 2 or a protein having substantial homology thereto, or

[0049] (iii) the protein designated human Hct-1 comprising the amino acid sequence set forth in SEQ Id No: 3 or a protein having substantial homology thereto.

[0050] By “substantial homology” is meant a degree of homology such that at least 50%, preferably at least 60% and most preferably at least 70% of the amino acids match. The invention of course covers related proteins having a higher degree of homology, e.g. at least 80%, at least 90% or more.
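The thresholds above can be illustrated with a minimal Python sketch that computes the percentage of matching amino acids between two pre-aligned sequences and compares it with the 50%, 60% and 70% levels. The specification does not state how the alignment is made or how gaps are counted; this sketch assumes the sequences are already aligned to equal length and that positions containing a gap character ("-") are ignored. The function names and the two short peptides are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of compared (non-gap) positions at which residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap positions are not counted
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def homology_level(identity):
    """Map a percent identity onto the thresholds named in the text."""
    if identity >= 70:
        return "most preferred (at least 70%)"
    if identity >= 60:
        return "preferred (at least 60%)"
    if identity >= 50:
        return "substantial (at least 50%)"
    return "below the stated threshold"


# Hypothetical aligned peptides differing at 2 of 10 positions.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQK")
level = homology_level(identity)
```

A different gap-handling convention or alignment method could give a different percentage for the same pair of proteins.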

[0051] The Hct-1 polypeptides may be produced in accordance with the invention by culturing a transformed host and recovering the desired Hct-1 polypeptide, characterised in that the host is transformed with nucleic acid comprising a coding sequence as defined above.

[0052] Examples of suitable hosts include yeast, bacterial, insect or mammalian cells. Although vectorless expression may be employed, it is preferred that the nucleic acid used to effect the transformation comprises an expression construct or an expression vector, e.g. a vaccinia virus, a baculovirus vector, a yeast plasmid or integration vector.

[0053] The invention further provides antibodies, especially monoclonal antibodies, which bind to Hct-1 proteins. These and the proteins of the invention may be employed in the design and/or manufacture of an antagonist to Hct-1 protein for diagnosis and/or treatment of diseases of cognitive function or neurodegenerative diseases. The use of Hct-1-associated promoters in the formation of constructs for use in the creation of transgenic animals is also envisaged according to the invention. The antibodies of the invention may be prepared in conventional manner, i.e. by immunising animals such as rodents or rabbits with purified protein obtained from recombinant yeast, or by immunising with recombinant vaccinia.

[0054] Hct-1 proteins provided according to the invention possess catalytic activity; thus they may be used in industrial processes to effect a catalytic transformation of a substrate. For example, where the substrate is a steroid, the proteins may be used to catalyse stereospecific transformations, e.g. transformations involving oxygen transfer.

DESCRIPTION OF DRAWINGS (SEE ALSO FIGURE LEGENDS—7 INFRA)

[0055] FIG. 1 illustrates (a) a restriction map of clone 12 and (b) the complete nucleotide and translation sequence of the 1.4 kb cDNA clone of rat Hct-1,

[0056] FIG. 2 illustrates Northern analysis of Hct-1 expression in adult rat and mouse brain, and other tissues,

[0057] FIG. 3 illustrates (a) restriction maps of clones 35 and 40 and (b) the complete nucleotide and translation sequence of mouse Hct-1 cDNA,

[0058] FIG. 4 illustrates an alignment of mouse Hct-1 with human CYP7 and highlights regions homologous to other steroidogenic P450s,

[0059] FIG. 5 illustrates an analysis of Hct-1 expression in mouse brain,

[0060] FIG. 6 illustrates Southern analysis of Hct-1 coding sequences in mouse, rat and human,

[0061] FIG. 7 illustrates Southern blot analyses of (a) mouse genomic DNA using a full length mouse Hct-1 cDNA clone and (b) rat genomic DNA probed with clone 14.5a,

[0062] FIG. 8 illustrates a genomic map of mouse Hct-1,

[0063] FIG. 9 illustrates a partial nucleotide sequence of human genomic Hct-1 (CYP7B1) and the encoded polypeptide,

[0064] FIG. 10 illustrates an amino acid alignment of mouse Hct-1 and human CYP7,

[0065] FIG. 11A illustrates Kozak sequences in mRNAs for steroidogenic P450's,

[0066] FIG. 11B illustrates mutagenesis of the 5′ end of the mouse Hct-1 cDNA to create a near-consensus translation initiation region surrounding the ATG (AUG),

[0067] FIG. 12 illustrates yeast expression vectors containing the mouse Hct-1 coding sequence, and

[0068] FIG. 13 illustrates vaccinia expression vectors containing the mouse Hct-1 coding sequence.

DESCRIPTION OF SPECIFIC EMBODIMENTS

[0069] Details of the isolation of hippocampus-associated DNA molecules according to the invention will now be described by way of example:

[0070] 1. Isolation of Gene Encoding Rat HCT-1

[0071] 1.1 Differential Screening of a Rat Hippocampus cDNA Library

[0072] To identify genes whose expression is enriched in the hippocampal formation we performed a differential hybridization screen of a hippocampal cDNA library. Adult rat hippocampal RNA was reverse transcribed using an oligo-dT-NotI primer and converted to double-stranded cDNA, EcoRI adaptors were attached, and the cDNA's were inserted between the EcoRI and NotI sites of a bacteriophage lambda vector.

[0073] 1.1.1 Preparation of cDNA Libraries

[0074] Following anaesthesia (sodium pentobarbital) of adult rats (Lister hooded) the hippocampal formation was dissected, including areas CA1-3 and dentate gyrus, subiculum, alvear and fimbrial fibres but excluding fornix and afferent structures such as septum and entorhinal cortex. The remainder of the brain was also pooled, taking care to exclude hippocampal tissue. Total RNAs were prepared by a standard guanidinium isothiocyanate procedure with centrifugation through a CsCl cushion, and poly-A⁺ mRNA was selected by affinity chromatography on oligo-dT cellulose. First strand cDNA synthesis used a NotI adaptor primer

[0075] [5′-dCAATTCGCGGCCGC(T)₁₅-3′]

[0076] and Moloney murine leukemia virus (MMLV) reverse transcriptase; second strand synthesis was performed by RNaseH treatment, DNA polymerase I fill-in and ligase treatment. Following the addition of hemi-phosphorylated EcoRI adaptors (5′-dCGACAGCAACGG-3′ and 5′-dAATTCCGTTGCTGTCG-3′) and cleavage with NotI the cDNA was inserted between the NotI and EcoRI sites of bacteriophage lambda vector lambda-ZAPII (Stratagene).
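The design of the oligonucleotides quoted above can be checked with a short Python sketch, offered for illustration only. It expands the adaptor primer, confirms that it carries the NotI recognition sequence (GCGGCCGC), and confirms that the two EcoRI adaptor oligonucleotides anneal to leave a 5′-AATT overhang compatible with an EcoRI-cut vector. The oligonucleotide sequences are those given in the text; the variable and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence

# First-strand primer: 5'-CAATTCGCGGCCGC(T)15-3'
adaptor_primer = "CAATTCGCGGCCGC" + "T" * 15

# EcoRI adaptor oligonucleotides as given in the text (5' to 3').
short_oligo = "CGACAGCAACGG"
long_oligo = "AATTCCGTTGCTGTCG"

primer_has_noti_site = NOTI_SITE in adaptor_primer

# The 3' portion of the long oligo should pair with the whole short oligo,
# leaving the 5' end of the long oligo single-stranded.
duplex_region = reverse_complement(short_oligo)
oligos_anneal = long_oligo.endswith(duplex_region)
overhang = long_oligo[: len(long_oligo) - len(short_oligo)]
```

The four unpaired bases at the 5′ end of the longer oligonucleotide are AATT, the cohesive end produced by EcoRI cleavage, which is consistent with directional insertion between the EcoRI and NotI sites.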

[0077] 1.1.2 Differential Hybridization Screening

[0078] Recombinant bacteriophage plaques were transferred in duplicate to Hybond-N membranes (Amersham), denatured (0.5 M NaOH, 1.5 M NaCl, 4 min), renatured (1 M Tris.HCl pH 7.4, 1.5 M NaCl), rinsed, dried and baked (2 h, 80° C.). Hybridization as described (Church et al., Proc. Natl. Acad. Sci. USA (1984), 81, 1991-1995) used a radiolabelled probe prepared by MMLV reverse transcriptase copying of poly-A⁺ RNA (from either hippocampus or the remainder of brain) into cDNA in the presence of α-³²P-dCTP and unlabelled dGTP, dATP and dTTP according to standard procedures. Following washing and exposure for autoradiography, differentially hybridizing plaques were repurified. Inserts were transferred to a pBluescript vector either by cleavage and ligation or by in vivo excision using the ExAssist/SOLR system (Stratagene).

[0079] Duplicate lifts from 500,000 plaques were screened with radiolabelled cDNA probes prepared by reverse transcription of RNA from either hippocampus (Hi) or ‘rest of brain’ (RB). Approximately 360 clones gave a substantially stronger hybridization signal with the Hi probe than with the RB probe; 49 were analysed in more depth. In vivo excision was used to transfer the inserts to a plasmid vector for partial DNA sequence studies. Of these, 21 were novel (not presented here); others were known genes whose expression is enriched in hippocampus but not specific to the formation (e.g., the rat amyloidogenic protein). Northern analysis was first performed using radiolabelled probes corresponding to the 21 novel sequences. While three (12.10a, 14.5a and 15.13a) identified transcripts specific to the hippocampus, 12.10a and 15.13a both hybridized to additional transcripts whose expression was not restricted to the formation. Clone 14.5a appeared to identify transcripts enriched in hippocampus and was dubbed Hct-1.

[0080] 1.2 Characterisation of Rat Hct-1

[0081] 1.2.1 Rat Hct-1 Encodes a Cytochrome P450

[0082] To extend this characterization, the insert of clone 14.5a (300 nt) was used to rescreen the hippocampal cDNA library. Four positives were identified (clones 14.5a-5, −7, −12 and −13), and the region adjacent to the poly-A tail was analysed by DNA sequencing. While clones 5 (0.7 kb) and 12 (1.4 kb) had the same 3′ end as the parental clone, clone 7 (0.9 kb) had a different 3′ end consistent with utilization of an alternative polyadenylation site. Clone 13 (2.5 kb), however, appeared unrelated to Hct-1 and was dubbed Hct-2.

[0083] Clones 12 and 7 were then fully sequenced and the sequences obtained were compared with the database. Significant homology was detected between clone 12 and the human and rat cDNA's encoding cholesterol 7α-hydroxylase, though the sequences are clearly distinct. At the nucleic acid level, the 1428 nt cDNA clone for rat Hct-1 shared 55% identity over an 1100 nt overlap with human cholesterol 7α-hydroxylase (CYP7) and 54% identity over a 1117 nt overlap with rat CYP7. FIG. 1 gives the partial cDNA sequences of rat Hct-1 and the encoded polypeptide.

[0084] 1.2.2 Hct-1 mRNA Expression in Rat

[0085] Rat Hct-1 clone 14.5a/12 (1.4 kb) was used to investigate the expression of Hct-1 mRNA in rat brain and other organs. We first performed in situ hybridization to sections of rat brain. While these preliminary experiments did not permit unambiguous localization of Hct-1 transcripts, we confirmed expression in the hippocampus, predominantly in the cell layers of the dentate gyrus, while weaker expression was detected in other hippocampal and brain regions (not presented). Northern analysis was then performed on RNA prepared from different sections of rat brain. In FIG. 2A the Hct-1 probe identifies three transcripts in hippocampus of 5.0, 2.1 and 1.8 kb, with the two smaller transcripts being particularly enriched in hippocampus. The larger transcript was only detectable in brain, while the two smaller transcripts were also present in liver (and, at much lower levels, in kidney) but not in other organs tested including adrenal (not shown), testis, and ovary. In brain, expression was also detected in olfactory bulb and cortex while very low levels were present in cerebellum (FIG. 2A).

[0086] 1.2.3 Sexual Dimorphism of Hct-1 Expression in Liver but not in Brain

[0087] The expression of several CYP's is known to be sexually dimorphic in liver. We therefore inspected liver and brain of male and female rats for the presence of Hct-1 transcripts. In FIG. 2B the Hct-1 probe revealed the 1.8 and 2.1 kb (and 5.0 kb, Hct-2) transcripts in both male and female brain, with the 2.1 kb Hct-1 transcript predominating. Levels of Hct-1 mRNA's in liver were more than 20-fold lower than those detected in brain. Furthermore, Hct-1 transcripts were only significant in liver from male animals; expression of Hct-1 in females was barely detectable, demonstrating that hepatic expression of Hct-1 is sexually dimorphic.

[0088] 2. Isolation of Mouse HCT-1

[0089] 2.1 Isolation of Mouse Hct-1 cDNA Clones

[0090] A mouse liver cDNA library, established as NotI-EcoRI fragments in a lambda-gt10 vector, was probed using a rat Hct-1 probe. The library was a kind gift of B. Luckow and K. Kästner, Heidelberg.

[0091] Because the transcripts identified by the Hct-1 probe (predominantly 1.8 and 2.1 kb) are clearly longer than the longest cDNA clone (1.4 kb) obtained from our rat hippocampus library, we elected to pursue studies with the mouse Hct-1 ortholog. A mouse liver cDNA library was screened using a rat Hct-1 probe and four clones were selected, none containing a poly-A tail. Two (clones 33 and 35, both 1.8 kb) gave identical DNA sequences at both their 5′ and 3′ ends, and this sequence was approximately 91% similar to rat Hct-1. The remaining two clones, 23 and 40, were also identical to each other and were related to the other clones except for a 5′ extension (59 nt) and a 3′ deletion (99 nt). The complete DNA sequences of clones 35 and 40 were therefore determined.

[0092] The sequences obtained were identical throughout the region of overlap. The mouse Hct-1 open reading frame (ORF) commences with a methionine at nucleotide 81 (numbering from clone 40) and terminates with a TGA codon at nucleotide 1600, encoding a protein of 507 amino acids (FIG. 3). At the 5′ end it is of note that the ATG initiation codon leading the ORF does not correspond to the translation initiation consensus sequence YYAYYATGR. However, the 5′ untranslated region cloned is devoid of other possible initiation codons and an in-frame termination triplet (TAA) lies 20 codons upstream of the ATG. The encoded polypeptide sequence aligns well with other cytochrome P450 sequences and we surmise that the ATG at position 81 represents the correct start site for translation. At the 3′ end the truncation of clone 40 lies entirely in the non-coding region downstream of the stop codon. Neither clone contained a poly-A tail but both contained a potential polyadenylation sequence (AATAAA) at a position corresponding precisely to that seen in the rat cDNA.
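The comparison with the consensus YYAYYATGR can be sketched in Python as follows, assuming the standard IUPAC meanings Y = C or T and R = A or G. The two nine-nucleotide test contexts are hypothetical examples chosen to show a match and a mismatch; neither is the actual sequence surrounding the Hct-1 ATG, which is given in FIG. 3 and not reproduced here.

```python
import re

# IUPAC codes needed for this consensus (Y = pyrimidine, R = purine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "R": "[AG]"}


def consensus_to_regex(consensus):
    """Convert an IUPAC consensus string into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


INITIATION_CONSENSUS = consensus_to_regex("YYAYYATGR")


def matches_initiation_consensus(context):
    """True if a 9-nt context (5 nt upstream, ATG, 1 nt downstream) matches."""
    return INITIATION_CONSENSUS.fullmatch(context.upper()) is not None


# Hypothetical contexts, for illustration only.
good_context = "CCACCATGG"   # pyrimidines and A upstream, purine after ATG
poor_context = "GAGTTATGC"   # purines at Y positions and a pyrimidine at R

good_matches = matches_initiation_consensus(good_context)
poor_matches = matches_initiation_consensus(poor_context)
```

A sequence that fails this test, as reported for the Hct-1 ATG, can still be the authentic start codon; the text relies on the absence of other upstream initiation codons and on the alignment with other cytochrome P450 sequences.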

[0093] 2.2 Structure of Mouse Hct-1 Polypeptide

[0094] As anticipated, nucleotide sequence homology of mouse Hct-1 was highest with human cholesterol 7α-hydroxylase, with approximately 56% identity over the coding region. At the polypeptide level the mouse ORF shows 81% identity to the rat Hct-1 polypeptide over 414 amino acids; the precise degree of similarity may be different as the full protein sequence of rat Hct-1 is not known. Both the human (CYP7) and rat cholesterol 7α-hydroxylase polypeptides share 39% amino acid sequence identity to mouse Hct-1. FIG. 4A presents the alignment of mouse Hct-1 polypeptide with human CYP7.

[0095] The N-terminus of the Hct-1 polypeptide is hydrophobic, a feature shared by microsomal CYP's. This portion of the polypeptide is thought to insert into the membrane of the endoplasmic reticulum, holding the main bulk of the protein on the cytoplasmic side. Consistent with microsomal CYP's, the N-terminus lacks basic amino acids prior to the hydrophobic core (amino acids 9-34).

[0096] Several alignment studies have previously highlighted conserved regions within CYP polypeptides. We therefore inspected the Hct-1 sequence for these conserved regions. CYP's contain a highly conserved motif, FxxGxxxCxG(xxxA), present in 202 of the 205 compiled sequences (Nelson et al., supra), that is thought to represent the heme binding site. The arrangement of amino acids around the cysteine residue has been postulated to preserve the three-dimensional structure of this region for ligand binding. This motif is fully conserved in Hct-1 (FIG. 4B). A second conserved domain is also present in CYP's responsible for steroid interconversions. While this domain is largely conserved in Hct-1 an invariant Pro residue is replaced, in Hct-1, by Val (FIG. 4C); the rat Hct-1 polypeptide also contains a Val residue at this position.
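A search for the motif FxxGxxxCxG(xxxA) can be sketched in Python with a regular expression in which "x" is any residue and the bracketed extension is treated as optional. This reading of the brackets is an assumption. The function name and the test peptide are hypothetical; the peptide is not the Hct-1 sequence.

```python
import re

# F, two residues, G, three residues, C, one residue, G,
# optionally followed by three residues and A.
HEME_MOTIF = re.compile(r"F..G...C.G(?:...A)?")


def find_heme_motif(protein):
    """Return (1-based start position, matched text) for each motif hit."""
    return [
        (match.start() + 1, match.group(0))
        for match in HEME_MOTIF.finditer(protein.upper())
    ]


# Hypothetical peptide containing one copy of the motif.
peptide = "AAKLFGSGPRICPGRRFALLE"
hits = find_heme_motif(peptide)

# A peptide in which the cysteine is replaced no longer matches.
no_hits = find_heme_motif("AAKLFGSGPRISPGRRFALLE")
```

Note that `finditer` reports non-overlapping matches only, which is adequate for a motif expected once per polypeptide.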

[0097] 2.3 Expression Pattern of Mouse Hct-1

[0098] To verify enriched expression of Hct-1 in hippocampus we performed Northern and in situ hybridization analyses on mouse material. In contrast to the situation in rat, the 1.4 kb clone 12 detected only a 1.8 kb transcript; the 2.1 kb and 5.0 kb transcripts were absent from all tissues examined (FIG. 2C). The apparent absence of the 2.1 kb transcript may only reflect a lower abundance of this transcript, because at least some mouse cDNA clones extend beyond the upstream polyadenylation site which is thought, in rat, to generate the shorter (1.8 kb) transcript.

[0099] To refine this analysis, a 42-mer oligonucleotide was designed according to the DNA sequence of the 3′ untranslated region of the cDNA clone upstream of the first polyadenylation site (materials and methods), so as to minimize cross-hybridization with other CYP mRNA's. Coronal sections of mouse brain were hybridized to the ³⁵S-labelled probe and, after emulsion dipping, exposed for autoradiography (FIG. 5). Transcripts were detected throughout mouse brain, with no evidence of restricted expression in the hippocampus (FIG. 5A,B). Strongest expression was observed in the corpus callosum, the anterior commissure and fornix while, as in rat, hippocampal expression was particularly prominent in the dentate gyrus (FIG. 5C). Moderate expression levels, comparable to those observed in hippocampus, were observed in cerebellum, cortex and olfactory bulb.

[0100] 2.4 The Structure of the mHct-1 Gene

[0101] The use of homologous recombination to manipulate the mouse Hct-1 gene requires knowledge of the intron-exon structure of the gene. Sequences upstream of the first Hct-1 exon could also be analysed for elements which contribute to the transcriptional regulation of Hct-1 expression. For these reasons, the organisation of the mouse Hct-1 gene was investigated.

[0102] To assess the complexity of the Hct-1 gene in the genome, that is, whether the Hct-1 gene is present as a single copy in the haploid mouse genome, and to assist in mapping of mHct-1 phage clones, the 1.8 kb full length mouse Hct-1 clone was ³²P-labelled by random primer labelling and used as a probe on a Southern blot of mouse genomic DNA (FIG. 7(a)). Under high stringency conditions the Hct-1 probe recognised a small number of bands within the mouse genomic digests, suggesting that Hct-1 is present in the mouse genome as a single copy gene. To confirm this, the original 0.3 kb cDNA clone, 14.5a, was used to probe a rat genomic Southern blot. The smaller probe hybridised to a single band in BamHI-, EcoRI-, and XbaI-digested genomic rat DNA (FIG. 7(b)).

[0103] A mouse genomic DNA library (a gift from A. Reaume, Toronto) prepared from ES cells derived from the 129 mouse strain was screened for genomic clones containing mHct-1 exonic sequence. 750,000 recombinant phage of the lambda DASH II library were plated at a density of 50,000 recombinants per 15 cm plate. Duplicate lifts were made and probed with the 1.4 kb rat Hct-1 clone. After the primary screen, 5 clones were isolated. After secondary screening, three of these phage clones were positive and were purified.
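The scale of the screen follows from simple arithmetic, sketched below in Python for illustration. The plate count is derived from the two figures stated in the text; the membrane count assumes that "duplicate lifts" means two membranes per plate. The function name is hypothetical.

```python
def plates_needed(total_recombinants, recombinants_per_plate):
    """Number of plates required, rounding up to a whole plate."""
    if recombinants_per_plate <= 0:
        raise ValueError("recombinants_per_plate must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_recombinants // recombinants_per_plate)


plates = plates_needed(750_000, 50_000)

# Assumption: duplicate lifts means two membranes per plate.
membranes = 2 * plates
```

With the figures given, the screen corresponds to 15 plates of 15 cm diameter and, under the stated assumption, 30 membranes.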

[0104] Small scale phage DNA was prepared from each phage lysate and cut with NotI to release the inserts. No internal NotI sites were found in any of the clones. Clone I-2 contained a 14 kb insert; clone I-6 contained a 15 kb insert, and clone I-11 contained a 12 kb insert.

[0105] These phage clones were mapped using a combination of restriction enzymes that cut the lambda clones rarely and restriction sites found in the mHct-1 cDNA sequence (FIG. 3). A 5′ probe was created using a 200 bp fragment from the 5′ end of mHct-1 cDNA; this segment extended from the internal BamHI site to an EcoRI site located in the polylinker. The 200 bp 3′ cDNA probe extended from the SacI site to the polylinker NotI site. Exon-intron boundaries were determined by subcloning of exon-containing genomic DNA fragments and sequencing (FIG. 8).

[0106] Phage clones I-6 and I-11 represented 20 kb of contiguous sequence of the Hct-1 locus. I-2 does not overlap with I-6 or I-11, thus the map of the Hct-1 gene in mouse is incomplete. However, the present map shows that mHct-1 spans at least 25 kb of the genome. At least two exons are contained within I-6. The first exon (referred to as exon II) contains 133 bp of coding sequence, followed by exon III, located 4.0 kb downstream. The 3′ boundary of this latter exon is not defined; however, approximately 400 bp downstream of its 3′ boundary commences exon IV. Exons III and IV together comprise 797 bp of coding sequence. Exons III and IV are also represented in the overlapping sequence of I-11. A fourth exon of at least 345 bp was identified in I-2 (referred to as exon VI). The 3′ boundary of this exon has not been identified, thus it is not known whether this contains the remaining coding sequence or if there are additional exons.

[0107] The following Table provides a summary of the exon-intron structure of Hct-1 (incomplete) and a comparison to the human CYP7 gene structure. * indicates that these exons are not cloned and are not necessarily one exon. ** indicates that the 3′ boundary of exon VI is not confirmed and that exon VI may not necessarily be the final exon.

Exon    cDNA sequence represented    Exon size (bp)    CYP7 exon (bp)
I*      1-142                        142               144
II      143-275                      133               241
III     276-?                        797               587
IV      ?-1072                       ″                 131
V*      1073-1246                    174               176
VI**    1247-(1821)                  (575)             1596

[0108] As shown in the Table, cDNA sequence from nucleotides 1073-1246 is not represented in the identified exons and must be represented in a separate exon. 142 bp of 5′ sequence and 227 bp of 3′ sequence have not yet been located in the genomic clones. The remaining 5′ sequence is most likely contained in one exon, as the 5′ probe (BamHI fragment) consistently recognised two bands by Southern analysis (one of which is exon II sequence). The remaining 3′ sequence has not been located and may be part of exon VI or be encoded by a separate exon.
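The exon sizes quoted in the Table follow directly from the cDNA coordinates. The short sketch below recomputes them as an arithmetic check. It assumes inclusive, 1-based coordinates, and it treats exons III and IV as a single span because their shared boundary is undefined; the names `EXON_SPANS`, `span_length` and `is_contiguous` are illustrative and do not come from the specification.

```python
# Exon coordinates on the mouse Hct-1 cDNA, copied from the Table.
# Exons III and IV are merged because their internal boundary is undefined.
EXON_SPANS = {
    "I": (1, 142),
    "II": (143, 275),
    "III+IV": (276, 1072),
    "V": (1073, 1246),
    "VI": (1247, 1821),
}


def span_length(start: int, end: int) -> int:
    """Length in bp of an inclusive, 1-based coordinate range."""
    return end - start + 1


def exon_sizes(spans: dict) -> dict:
    """Map each exon name to its size in bp."""
    return {name: span_length(start, end) for name, (start, end) in spans.items()}


def is_contiguous(spans: dict) -> bool:
    """True when every span begins one base after the previous span ends."""
    ordered = sorted(spans.values())
    return all(nxt[0] == prev[1] + 1 for prev, nxt in zip(ordered, ordered[1:]))


sizes = exon_sizes(EXON_SPANS)
for name, size in sizes.items():
    print(f"exon {name}: {size} bp")
```

Under these assumptions the computed sizes agree with the Table, and the spans tile the cDNA from nucleotide 1 to nucleotide 1821 without gaps.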

[0109] 3. Isolation of Human Genomic Sequences for HCT-1

[0110] 3.1 Conservation of Hct-1 in Humans.

[0111] The evolutionary conservation of a gene supports a functionally significant role for that gene in the organism. The conservation of Hct-1 in rodents has been demonstrated by the cloning of the rat and mouse cDNAs for Hct-1. To establish the presence of the Hct-1 gene in the human genome, Southern blotting of human DNA was performed. The rat 1.4 kb clone of Hct-1 was used as a radiolabelled probe and gave strong signals from all three species (FIG. 6). A number of hybridising fragments appear to be conserved between species, suggesting conservation of the Hct-1 gene structure. There is a conserved 1.4 kb HindIII band between mouse and rat, while human DNA contains a slightly larger HindIII band of 1.6 kb. Also, an EcoRI fragment of 11 kb is conserved in human and rat Hct-1. Conservation of Hct-1 gene structure is also supported by the cDNA digestion patterns of mouse and rat (see FIGS. 6 and 7), where the SacI, HindIII and PstI sites are conserved between the rodent species.

[0112] 3.2 A Single Gene for Hct-1 in Mouse, Rat and Human

[0113] Because CYP's comprise a family of related enzymes we wished to determine whether close homologs of Hct-1 are present in the mammalian genome. The rat Hct-1 probe (1.4 kb) was used to probe a genomic Southern blot of rat, mouse and human DNA. In FIG. 6 the probe revealed a simple pattern of cross-hybridizing bands in all DNA's examined. In BamHI-cut human DNA only a single major cross-hybridizing band (4 kb) was detected (FIG. 6), while reprobing with the 300 nt. clone 14-5a yielded, in each lane, a single cross-hybridizing band (not shown). These data argue that a single conserved Hct-1 gene is present in mouse, rat and human, and that the mammalian genome does not contain very close homologs of Hct-1 that would be detected by cross-hybridization (>70-80% homology).

[0114] 3.3 Isolation of Sequences Encoding Human Hct-1

[0115] The rat cDNA clone 14.5a-12 was used to probe a Southern blot of human genomic DNA digested with BamHI according to standard procedures. A single band at 3.8 kb was identified that cross-hybridises with the probe. Accordingly, 20 μg of human genomic DNA was cleaved to completion with BamHI, resolved by agarose gel electrophoresis, and the size range 3.4-4.2 kb selected by reference to markers run on the same gel. The gel fragment was digested by agarase treatment, DNA was purified by phenol extraction and ethanol precipitation, and ligated into BamHI-cut bacteriophage lambda ZAP vector (Stratagene). Following packaging in vitro and plating on a lawn of E. coli strain XL1-Blue, plaque lifts of 100,000 clones were screened for hybridisation to the rat cDNA. 12 positive signals were identified and all contained a 3.8 kb insert. One was selected and the segment was partially sequenced, identifying two regions of high homology to the rat (and mouse) cDNA's and corresponding to exons 3 and 4. FIG. 9 presents the nucleotide sequence and FIG. 10 compares the human Hct-1 translation product with the cognate mouse polypeptide.

[0116] To extend this characterisation, the 3.8 kb BamHI fragment obtained from the size-selected library was used to screen a genomic library of human DNA prepared by partial Sau3A cleavage and insertion of 14-18 kb fragments into a bacteriophage lambda vector according to standard techniques (gift of Dr. P. Estibeiro, CGR). Positive clones were obtained, and restriction mapping of one confirmed that it contains approximately 14 kb of human DNA encompassing the exons identified above and further regions of the Hct-1 gene; together the different genomic clones are thought to encompass the entire Hct-1 gene. The human genomic sequence may be used to screen human cDNA libraries for full length cDNA clones; alternatively, following complete DNA sequence determination the human genomic sequence may be expressed in mammalian cells by adjoining it to a suitable promoter sequence and cDNA prepared from the correctly spliced mRNA product so produced. Finally, the genomic Hct-1 sequence would permit the entire coding sequence to be deduced so permitting the assembly of a full length Hct-1 coding sequence by de novo synthesis.

[0117] 3.4 Expression of Hct-1 Protein for Enzymatic Activity Analysis

[0118] 3.4.1. Expression of Hct-1 Polypeptide in Yeast Cells

[0119] Recombinant yeast strains are useful vehicles for the production of heterologous cytochrome P450 proteins. It would be possible to express any of the mammalian Hct-1's in yeast, but for simplicity we selected the mouse Hct-1 clone 35. To introduce the mouse Hct-1 (mHct-1) coding sequence into yeast the expression vector pMA91 (Kingsman et al., Meth. Enzymol. 185: 329-341, 1990) was employed. The unique BglII site in pMA91 was converted to a NotI site by inserting the oligonucleotide 5′-GATCGCGGCCGC-3′ according to standard procedures. Following cleavage of the resulting plasmid (pMA91-Not) with NotI the mHct-1 cDNA clone 35 was introduced, placing mHct-1 expression under the control of the yeast PGK (phosphoglycerokinase) promoter for high level expression in yeast cells (FIG. 12A). A similar construct utilising the mHct-1 cDNA clone 35 is depicted in FIG. 12B. Expression of mHct-1 in yeast using these plasmids permits the purification of the protein and determination of substrate specificity.

[0120] 3.4.2. Expression of Hct-1 Polypeptide in Vaccinia Virus

[0121] Expression in vaccinia virus is a routine procedure and has been widely employed for the expression of heterologous cytochromes P450 in mammalian cells, including HepG2 and HeLa cells (Gonzalez, Aoyama and Gelboin, Meth. in Enzymol. 206: 85-92, 1991; Waxman et al., Archives Biochem. Biophys. 290, 160-166, 1991). Accordingly we selected plasmid pTG186-poly (Lathe et al., Nature 326, 878-880, 1987) as the transfer/expression vector, although other similar vectors are widely available and may also be employed.

[0122] To demonstrate the expression of mammalian Hct-1's in vaccinia virus, for simplicity we selected the mHct-1 clone 35. Similar techniques are applicable to rat and human Hct-1's. To enhance expression we elected to modify the 5′ end to conform better to the translation consensus for mammalian cells (YYAYYATGR), though this modification may not be essential.
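The consensus YYAYYATGR is written in IUPAC ambiguity codes, where Y denotes a pyrimidine (C or T) and R a purine (A or G). As an illustration only, the sketch below expands such a consensus into a regular expression and tests candidate nine-base windows against it. The function names and the example windows in the usage lines are hypothetical and are not taken from the Hct-1 sequence.

```python
import re

# IUPAC codes needed for the consensus quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "R": "[AG]"}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Expand an IUPAC consensus string into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def matches_consensus(site: str, consensus: str = "YYAYYATGR") -> bool:
    """True when the whole of `site` fits the consensus."""
    return consensus_to_regex(consensus).fullmatch(site.upper()) is not None


# Hypothetical nine-base windows centred on an ATG.
print(matches_consensus("CCACCATGG"))  # pyrimidines upstream, purine after ATG
print(matches_consensus("GGAGGATGG"))  # purines where pyrimidines are expected
```

A full-length match is required here, so the window passed in must already be trimmed to the nine positions of the consensus.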

[0123] Accordingly, two oligonucleotides were designed corresponding to the 5′ and 3′ regions of the mouse cDNA.

[0124] The 5′ oligonucleotide:

[0125] (5′-GGCCCTCGAGCCACCATGCAGGGGAGCCACG-3′)

[0126] is homologous to the region surrounding the translation initiation site but converts the sequence immediately prior to the ATG to the sequence CCACC; in addition, the oligonucleotide contains an XhoI restriction site for subsequent cloning. The 3′ oligonucleotide (GGCCGAATTCTCAGCTTCTCCAAGAA) was chosen according to the sequence downstream of the translation stop site and contains, in addition, an EcoRI site for subsequent cloning. These oligonucleotides were employed in polymerase chain reaction (PCR) amplification through 5 cycles on the clone 35 template; the products were applied to an agarose gel and the desired product band at 1.65 kb was cut out and extracted by standard procedures.
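The features described for the two oligonucleotides can be checked by simple string searching. The sketch below locates the XhoI (CTCGAG) and EcoRI (GAATTC) recognition sequences in the primers as printed above and extracts the five bases immediately upstream of the first ATG in the 5′ oligonucleotide. The helper names are illustrative, and the check covers only the primer text itself, not the clone 35 template.

```python
# Recognition sequences of the two enzymes named in the text.
XHOI = "CTCGAG"
ECORI = "GAATTC"

# Primer sequences as printed in the specification (5' to 3').
FORWARD = "GGCCCTCGAGCCACCATGCAGGGGAGCCACG"
REVERSE = "GGCCGAATTCTCAGCTTCTCCAAGAA"


def find_site(seq: str, site: str) -> int:
    """0-based index of the first occurrence of `site`, or -1 if absent."""
    return seq.upper().find(site)


def upstream_of_atg(seq: str, length: int = 5) -> str:
    """Return the `length` bases immediately 5' of the first ATG."""
    atg = seq.upper().find("ATG")
    if atg < length:
        raise ValueError("no ATG with enough upstream sequence")
    return seq[atg - length:atg]


print("XhoI site in 5' primer at index", find_site(FORWARD, XHOI))
print("EcoRI site in 3' primer at index", find_site(REVERSE, ECORI))
print("bases before ATG:", upstream_of_atg(FORWARD))
```

Both sites sit after a four-base GGCC clamp at the 5′ end of each primer, and the bases preceding the ATG read CCACC, as stated in the text.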

[0127] Following cleavage with XhoI and EcoRI the modified fragment was introduced between the EcoRI and SalI sites of pTG186-poly, generating pVV-mHct-1. Recombinational exchange was used to transfer the expression vector to the vaccinia virus genome according to standard procedures, generating VV-mHct-1, as depicted in FIG. 13. This recombinant will permit the expression of high levels of mHct-1 and the identification of the substrate specificity of the protein, as well as the production of antibodies directed against mHct-1.

[0128] To identify the product of P450-mediated metabolism, microsomes may easily be prepared (Waxman, Biochem. J. 260: 81-85, 1989) from vaccinia-infected cells; these are incubated with labelled precursors, e.g. steroids, and the product identified by thin layer chromatography according to standard procedures (Waxman, Methods in Enzymology 206: 462-476).

[0129] The Hct-1 provided according to this invention thereby provides a route for the large-scale production of the product described above, for instance a modified steroid, by expressing the P450 in a recombinant organism and supplying the substrate for conversion. It will also be possible to engineer recombinant yeast, for instance, to synthesise the substrate for the Hct-1 P450 in vivo, so as to allow production of the Hct-1 product from yeast supplied with a precursor, for instance cholesterol or other molecule, if that yeast is engineered to contain other P450's or modifying enzymes. It may be possible for Hct-1 to act on endogenous sterols and steroids in yeast to yield product.

[0130] Finally, the Hct-1 product may be part of a metabolic chain, and recombinant organisms may be engineered to contain P450's or other enzymes that convert the Hct-1 product to a subsequent product that may in turn be harvested from the organism.

[0131] 4. Discussion

[0132] In experiments to characterize transcripts enriched in the hippocampal formation we isolated cDNA clones corresponding to Hct-1 (hippocampal transcript) from a library prepared from rat hippocampus RNA. In rat, expression appeared to be most abundant in hippocampus with some expression in cortex and substantially less expression in other brain regions. Elsewhere in the body transcripts were only detected in liver and, to a lesser extent, in kidney; expression was barely detectable in ovary, testis and adrenal, also sites of steroid transformations. Hepatic expression was sexually dimorphic with Hct-1 mRNA barely detectable in female liver. In rat brain and liver, Hct-1 identifies two transcripts of 1.8 and 2.1 kb that appear to be generated by alternative polyadenylation; a 5.0 kb transcript weakly detected in brain is thought not to originate from the Hct-1 gene but instead encodes a polypeptide related to the GTPase activating protein, ABR (active BCR-related).

[0133] Sequence analysis of Hct-1 cDNA clones revealed an extensive open reading frame encoding a protein with homology to cytochromes P450 (CYP's), a family of heme-containing mono-oxygenases responsible for a variety of steroid and fatty acid interconversions and the oxidative metabolism of xenobiotics. Although the mouse cDNA coding region appears complete, the absence of a consensus translation initiation site flanking the presumed initiation codon could indicate that Hct-1 polypeptide synthesis is subject to regulation at the level of translation initiation.

[0134] Homology was highest with rat and human cholesterol 7α-hydroxylase, known as CYP7. While related, Hct-1 is clearly distinct from CYP7, sharing only 39% homology over the full length of the protein. CYP polypeptides sharing greater than 40% sequence identity are generally regarded as belonging to the same family, and Hct-1 and CYP7 (39% similarity) are hence borderline. The conservation of other unique features between Hct-1 and CYP7 however argues for a close relationship and Hct-1 has been provisionally named ‘CYP7B’ by the P450 Nomenclature Committee (D. R. Nelson, personal communication).
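The 40% family criterion can be made concrete with a small calculation of percent identity between two aligned sequences. The sketch below is a minimal illustration under stated assumptions: the inputs are already aligned and of equal length, gaps are written as hyphens, and identity is counted over ungapped columns only (other denominators are in use elsewhere). The toy sequences and the function names are hypothetical and are not the Hct-1 or CYP7 alignments of FIG. 4.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def same_family(identity_pct: float, threshold: float = 40.0) -> bool:
    """Apply the 'greater than 40% identity' rule of thumb quoted in the text."""
    return identity_pct > threshold


# Toy alignment, for illustration only.
toy = percent_identity("MKT-LLA", "MRTALLG")
print(round(toy, 1), same_family(toy))
print(same_family(39.0))
```

With the 39% figure reported for Hct-1 and CYP7, `same_family` returns False, which is the borderline situation the text describes.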

[0135] From the Hct-1 leader sequence we surmise that the Hct-1 polypeptide resides, like CYP7, in the endoplasmic reticulum and not in mitochondria, the other principal cellular site of CYP activity. The strictly conserved heme binding site motif FxxGxxxCxG(xxxA) is clearly present in Hct-1 (residues 440-453). It is of note that the ‘steroidogenic domain’, conserved in many CYP's responsible for steroid interconversions, is also present in Hct-1 (amino acids 348-362), except that a consensus Pro residue is replaced by Val in both the mouse and rat Hct-1 polypeptides. Of 34 previously known CYP sequences, only 4 contain an amino acid residue other than Pro at this position. Whereas 2 of these harbour an unrelated amino acid (Glu; CYP3A1, CYP3A3), interestingly, a Val residue is present in bovine CYP17 (steroid 17α-hydroxylase, 44) at a position equivalent to that in Hct-1 while human CYP17 harbours a conservative substitution at this site (Leu; 44). Despite this similarity, however, the overall extent of homology between Hct-1 and CYP17 (22.5%, not shown) is lower than with CYP7 (39%).
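The heme-binding motif FxxGxxxCxG(xxxA) translates directly into a regular expression in which each x is any residue and the bracketed extension is optional. The sketch below applies it to a 24-residue fragment transcribed into one-letter code from the partial rat Hct-1 translation in the sequence listing (Ser Tyr Ile Ile Pro Phe Gly Leu Gly Thr Ser Lys Cys Pro Gly Arg Tyr Phe Ala Ile Asn Glu Met Lys). Reported positions are relative to this fragment, not to the residue numbering used in the text, and the function name is illustrative.

```python
import re

# FxxGxxxCxG with the optional xxxA extension; '.' stands for any residue.
HEME_MOTIF = re.compile(r"F..G...C.G(...A)?")

# Fragment of the partial rat Hct-1 translation, one-letter code.
RAT_FRAGMENT = "SYIIPFGLGTSKCPGRYFAINEMK"


def find_heme_motif(protein: str):
    """Return (0-based start, matched text) for the first motif hit, or None."""
    match = HEME_MOTIF.search(protein.upper())
    if match is None:
        return None
    return match.start(), match.group(0)


print(find_heme_motif(RAT_FRAGMENT))
```

The search is deliberately permissive; a hit indicates only that the spacing of the F, G, C and G residues fits the pattern, not that the site is functional.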

[0136] Neither Hct-1 nor CYP7 appears to contain a conserved O₂ binding pocket (equivalent to residues 285-301 in Hct-1). Crystallographic studies on the bacterial CYP101 indicated that a Thr residue (corresponding to position 294 in Hct-1) disrupts helix formation in that region and is important in providing a structural pocket for an oxygen molecule. Site-directed mutagenesis of this Thr residue in both CYP4A1 and CYP2C11 demonstrated that this region can influence substrate specificity and affinity. In both Hct-1 and CYP7 the conserved Thr residue is replaced by Asn. This modification suggests that Hct-1 and CYP7 are both structurally distinct from other CYP's in this region; this may be reflected both in modified oxygen interaction and substrate choice.

[0137] The sexual dimorphism of Hct-1 expression observed in rat resembles that observed with a number of other CYP's. CYP2C12 is expressed preferentially in liver of the female rat while, like Hct-1, CYP2C11 is highly expressed in male liver but only at low levels in the female tissue. This dimorphic expression pattern of CYP2C family members is thought to be determined by the dimorphism of pulsatility of growth hormone secretion. Brain expression of Hct-1 is not subject to this control, suggesting that regulatory elements determining Hct-1 expression in brain differ from those utilized in liver. However, we have not examined species other than rat; it cannot be assumed that the same regulation will exist in other species. Indeed, sexually dimorphic gene expression is not necessarily conserved between different strains of mouse.

[0138] Expression of Hct-1 was widespread in mouse brain. The expression pattern was most consistent with glial expression but further experiments will be required to compare neuronal and non-neuronal levels of expression. In mouse brain only the 1.8 kb transcript was detected, though cDNA's were obtained corresponding to transcripts extending beyond the first polyadenylation site; such extended transcripts are thought to give rise to the 2.1 kb transcript in rat. This suggests the downstream polyadenylation site seen in rat Hct-1 is under-utilized in mouse Hct-1 or absent. While in situ hybridization studies of Hct-1 in rat brain were inconclusive, a difference in expression pattern between mouse and rat appears likely; further work will be required to confirm this. However, such a difference would be unsurprising because cytochromes P450 are well known to vary widely in their level and pattern of expression in different species; for instance, hepatic testosterone 16-hydroxylation levels differ by more than 100-fold between guinea pig and rat.

[0139] Our data indicate that the Hct-1 gene is present in rat, mouse and human, and there appear to be no very close relatives in the mammalian genome. While CYP genes are scattered over the mouse and human genomes, CYP subfamilies can cluster on the same chromosome. For instance, the human CYP2A and 2B subfamily genes are linked to chromosome 19, CYP2C and 2E subfamilies are located on human chromosome 10, and the mouse cyp2a, 2b and 2e subfamilies are present on mouse chromosome 7. The gene encoding human cholesterol 7α-hydroxylase (CYP7) is located on chromosome 8q11-q12.

[0140] Together our data argue that Hct-1 and CYP7 are closely related: this suggests that the substrate for Hct-1, so far unknown, is likely to be related to cholesterol or one of its steroid metabolites. This interpretation is borne out by the presence, in Hct-1, of the steroidogenic domain conserved in a number of steroid-metabolizing CYP's. While experiments are underway to determine the substrate specificity of Hct-1, the possibility that Hct-1 acts on cholesterol or its steroid metabolites in brain is of some interest. CYP7 (cholesterol 7α-hydroxylase) is responsible for the first step in the metabolic degradation of cholesterol. This is of note in view of the association of particular alleles of the APOE gene encoding the cholesterol transporter protein apolipoprotein E with the onset of Alzheimer's disease, a neurodegenerative condition whose cognitive impairments are associated with early dysfunction of the hippocampus.

[0141] What role might Hct-1 play in the brain? In the adult, CYP's are generally expressed abundantly in liver, adrenal and gonads, while the level of CYP activity in brain is estimated to be 0.3 to 3% of that found in liver (see 58). Because levels of Hct-1 mRNA expression in rat and mouse brain far exceed those in liver it could be argued that the primary function of Hct-1 lies in the central nervous system. The documented ability of cholesterol-derived steroids to interact with neurotransmitter receptors and modulate both synaptic plasticity and cognitive function suggests that Hct-1 and its metabolic product(s) may regulate neuronal function in vivo.

[0142] 5. Summary

[0143] Hct-1 (hippocampal transcript) was detected in a differential screen of a rat hippocampal cDNA library. Expression of Hct-1 was enriched in the hippocampal formation but was also detected in rat liver and kidney, though at much lower levels; expression was barely detectable in testis, ovary and adrenal. In liver, unlike brain, expression was sexually dimorphic: hepatic expression was greatly reduced in female rats. In mouse, brain expression was widespread, with the highest levels being detected in corpus callosum; only low levels were detected in liver. Sequence analysis of rat and mouse Hct-1 cDNAs revealed extensive homologies with cytochrome P450's (CYP's), a diverse family of heme-binding monooxygenases that metabolize a range of substrates including steroids, fatty acids and xenobiotics. Among the CYP's, Hct-1 is most similar (39% at the amino acid sequence level) to cholesterol 7α-hydroxylase (CYP7), and contains the diagnostic steroidogenic domain present in other steroid-metabolizing CYPs, but clearly represents a type of CYP not previously reported. Genomic Southern analysis indicates that a single gene corresponding to Hct-1 is present in mouse, rat and human. Hct-1 is unusual in that, unlike all other CYP's described, the primary site of expression is in the brain. Similarity to CYP7 and other steroid-metabolizing CYP's argues that Hct-1 plays a role in steroid metabolism in brain, notable because of the documented ability of brain-derived steroids (neurosteroids) to modulate cognitive function in vivo.

[0144] 6. Details of Experimental Protocols

[0145] Northern analysis. Total RNA was extracted by tissue homogenization in guanidinium thiocyanate according to a standard procedure and further purified by centrifugation through a CsCl cushion. Where appropriate, polyA-plus RNA was selected on oligo-dT cellulose. Electrophoresis of RNA (10 μg) on 1% agarose in the presence of 7% formaldehyde was followed by capillary transfer to nylon membranes, baking (2 h, 80° C.), and rinsing in hybridization buffer (0.25 M NaPhosphate, pH 7.2; 1 mM EDTA, 7% sodium dodecyl sulphate [SDS], 1% bovine serum albumin) as described (Church et al., supra). Probes were prepared by random-primed DNA polymerase copying of denatured double-stranded DNA. Hybridization (16 h, 68° C.) was followed by washing (3 times, 20 mM NaPhosphate pH 7.2, 1 mM EDTA, 1% SDS, 20 min.) and membranes were exposed for autoradiography. The loading control probe was a 0.5 kb cDNA encoding the ubiquitously expressed rat ribosomal protein S26.

[0146] In situ Hybridization. Synthetic Hct-1 oligonucleotide probes 5′-dGACAGGTTTTGTGACCCAAAACAAACTGGATGGATCGCAATC-3′ (rat, 55% G+C) and 5′-ATCACGGAGCTCAGCACATGCAGCCTTACTCTGCAAAGCTTC-3′

[0147] (mouse, 48% G+C) were labelled using terminal transferase (Boehringer Mannheim) and α-³⁵S-dATP (Amersham) according to the manufacturer's instructions.

[0148] The control probe, 5′-dAGCCTTCTGGGTCGTAGCTGACTCCTGCTGCTGAGCTGCAACAGCTTT-3′ (56% G+C), was based on human opsin cDNA. Frozen coronal 10 μm sections of brain were fixed (4% paraformaldehyde, 10 min), rinsed, treated with proteinase K (20 μg/ml in 50 mM Tris.HCl, pH 7.4, 5 mM EDTA, 5 min), rinsed, and refixed with paraformaldehyde as before. Following acetylation (0.25% acetic anhydride, 10 min) and rinsing, sections were dehydrated by passing through increasing ethanol concentrations (30, 50, 70, 85, 95, 100, 100%, each for 1 minute except the 70% step [5 min]). Following CHCl₃ treatment (5 min), and rinsing in ethanol, sections were dried before hybridization. Hybridization in buffer (4× standard saline citrate [1×SSC=0.15 M NaCl, 0.015 M Na₃citrate], 50% v/v formamide, 10% w/v dextran sulphate, 1× Denhardt's solution, 0.1% SDS, 500 μg/ml denatured salmon sperm DNA, 250 μg/ml yeast tRNA) was for 16 h at 37° C. Slides were washed (4×15 min., 1×SSC, 60° C.; 2×30 min., 1×SSC, 20° C.), dipped into photographic liquid emulsion (LM-1, Amersham), exposed and developed according to the manufacturer's specifications. Slides were counterstained with 1% methyl green.

[0149] Southern hybridization. Genomic DNA prepared from mouse or rat liver, or from human lymphocytes, was digested with the appropriate restriction endonuclease, resolved by agarose gel electrophoresis (0.7%) and transferred to Hybond-N membranes. Following baking (2 h, 80° C.), hybridization conditions were as described for Northern analysis.

[0150] Hybridisation Conditions. Hybridisation conditions used were based on those described by Church and Gilbert, Proc. Natl. Acad. Sci. USA (1984) 81, 1991-1995.

[0151] 1. Filters were pre-wet in 2×SSC.

[0152] 2. The hybridisation was performed in a rotating glass cylinder (Techne Hybridiser ovens). 10 ml of Hybridisation Buffer was added to the cylinder with the filter.

[0153] 3. Prehybridisation and hybridisation were carried out at 68° C. unless otherwise specified.

[0154] 4. The filters were prehybridised for 30 minutes, after which the probe was added directly and hybridisation proceeded overnight. (Double-stranded probes were denatured by boiling for 2 minutes, then placing on ice.)

[0155] 5. Washes were performed at 68° C. (unless otherwise stated) with 2 changes of Wash Buffer I for 10 minutes each, followed by three changes of Wash Buffer II each for 20 minutes.

[0156] 6. The filters were blotted dry, but not allowed to dry out, then placed between Saran wrap, and against X-ray film for autoradiography.

[0157] Hybridisation Buffer:

[0158] 0.25 M sodium phosphate pH 7.2

[0159] 1 mM EDTA

[0160] 7% SDS

[0161] 1% BSA

[0162] Wash Buffer I:

[0163] 20 mM sodium phosphate pH 7.2

[0164] 2.5% SDS

[0165] 0.25% BSA

[0166] 1 mM EDTA

[0167] Wash Buffer II:

[0168] 20 mM sodium phosphate pH 7.2

[0169] 1 mM EDTA

[0170] 1% SDS

[0171] Screening of Bacteriophage lambda libraries. The rat hippocampus cDNA library was oligo-(dT)-NotI primed and cloned in lambda ZAP II (Stratagene) with an EcoRI adaptor at the 5′ end, and was prepared in the lab by Miss M. Richardson and Dr. J. Mason; the mouse liver cDNA library was oligo-(dT)-primed and cloned into lambda gt10 with EcoRI/NotI adaptors, and was a gift from Dr. B. Luckow, Heidelberg; the mouse ES cell genomic library was cloned from a partial Sau3A digest into lambda DASH II (Stratagene), and was a gift from A. Reaume, Toronto.

[0172] The libraries were screened as described above by hybridization.

[0173] In vivo excision of pBluescript from lambda ZAP II vector was performed using the ExAssist/SOLR system (Stratagene, 200253).

[0174] In situ hybridisation. Frozen 10 μm coronal sections of rat and mouse brains were provided by Dr. M. Steel.

[0175] Hybridisation Conditions. All probes were oligonucleotides which were labelled by homopolymer tailing using α-³⁵S-dATP and terminal transferase.

[0176] The sequences or references of the oligonucleotides used as probes for in situ hybridisation were as follows:

[0177] rat Hct-1 (a 42-mer, beginning 26 nt 5′ from the polyA tail, nucleotides 1361-1403 in FIG. 4.2) (for relative position in mouse gene, see FIG. 4.3) 5′-GACAGGTTTTGTGACCCAAAACAAACTGGATGGATCGCAATC-3′

[0178] mouse Hct-1 (nt 1558-1599) 5′-ATCACGGAGCTCAGCACATGCAGCCTTACTCTGCAAAGCTTC-3′

[0179] rat clone 13 (a 45-mer, beginning 112 nt 5′ from polyA tail) 5′-TATATCCATACCAACTTATTGGGAGTCCCATCCTACCTCATCAGC-3′

[0180] rat/mouse muscarinic receptor M1 (Buckley et al., 1988)

[0181] rat/mouse opsins (Nathans et al., Science (1986) 232, 193-202)

[0182] 1. The prepared ³⁵S-tailed probe (resuspended in 10 mM DTT in TE) was diluted to 2×10⁶ cpm/ml in hybridisation buffer. DTT was also added to this mixture to a final concentration of 50 mM.

[0183] 2. 100 μl of the probe mixture was carefully layered onto each microscope slide. A piece of parafilm cut to the size of the microscope slide was then layered over the probe mixture, allowing the probe and hybridisation mixture to cover all the sections. Air bubbles under the parafilm were avoided.

[0184] 3. The slides were placed in a humidified container, sealed, and incubated at 37° C. overnight.

[0185] 4. After hybridisation, the parafilm was carefully removed using forceps.

[0186] 5. The slides were placed back in Coplin jars, and the hybridised sections washed in four changes of 1×SSC for 15 minutes at 55° C. or 60° C., and then two changes of 1×SSC for 30 minutes at room temperature.

[0187] 6. The slides were rinsed briefly in dH₂O, then left to air dry.

[0188] Hybridisation Buffer*:

[0189] 4×SSC

[0190] 50% (v/v) deionised formamide

[0191] 10% (w/v) dextran sulphate

[0192] 1× Denhardt's solution

[0193] 0.1% (w/v) SDS

[0194] 500 μg/ml ssDNA

[0195] 250 μg/ml yeast tRNA

[0196] *buffer was de-gassed before use
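The buffer recipe above is stated as final concentrations. As a worked illustration, the sketch below converts the 4×SSC entry to molar salt concentrations using the 1×SSC definition given earlier (0.15 M NaCl, 0.015 M Na₃citrate) and applies the standard dilution relation C1·V1 = C2·V2 to obtain stock volumes. The 20×SSC and 100% formamide stocks and the 10 ml batch size are assumptions made for the example only and are not specified in the protocol.

```python
# Molar composition of 1x SSC as defined in the text.
SSC_1X_MOLAR = {"NaCl": 0.15, "Na3citrate": 0.015}


def ssc_molarity(fold: float) -> dict:
    """Molar salt concentrations of an SSC solution at the given fold strength."""
    return {salt: fold * molar for salt, molar in SSC_1X_MOLAR.items()}


def stock_volume(final_conc: float, stock_conc: float, final_volume: float) -> float:
    """Volume of stock needed, from C1*V1 = C2*V2 (same units for both concentrations)."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc * final_volume / stock_conc


# Hypothetical 10 ml batch made from assumed 20x SSC and 100% formamide stocks.
BATCH_ML = 10.0
print(ssc_molarity(4))
print("20x SSC needed (ml):", stock_volume(4, 20, BATCH_ML))
print("formamide needed (ml):", stock_volume(50, 100, BATCH_ML))
```

The same helper applies to any component given as a fold or percentage strength, provided the stock and target are expressed in the same units.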

[0197] 7. Figure Legends

[0198] FIG. 1. Sequence of partial rat Hct-1 cDNA and the encoded polypeptide. The nucleotide sequence and translation product of the 1.4 kb cDNA clone 12 including additional clone 7 sequence (lower case). The two putative polyadenylation signals are underlined.

[0199] FIG. 2. Northern analysis of Hct-1 expression in adult rat and mouse brain. Panel A. Expression in rat brain and other tissues; panel B. sexually dimorphic expression in rat liver; panel C. Expression in mouse tissues. Poly-A⁺ (A) or total (B,C) RNA from organs of adult animals were resolved by gel electrophoresis; the hybridization probe was rat Hct-1 cDNA clone 12 (1.4 kb), the probe for the loading control (below) corresponds to ribosomal protein S26. Tissues analysed are: Hi, hippocampus; RB, remainder of brain lacking hippocampus; Cx, cortex; Cb, cerebellum; Ob, olfactory bulb; Li, liver; He, heart; Th, thymus; Ki, kidney; Ov, ovary; Te, testis; Lu, lung.

[0200] FIG. 3. Mouse Hct-1 cDNA and the sequence of the encoded polypeptide. The restriction map of the cDNA (above) corresponds to the compilation of two independent clones sequenced; the cross-hatched box indicates the coding region. The nucleotide sequence and translation product (below) derives from this compilation. Lower case sequences indicate the 59 additional 5′ nucleotides in clone 40 and the 99 additional 3′ nucleotides in clone 35. The putative polyadenylation site is underlined.

[0201] FIG. 4. Alignment of mouse Hct-1 with human CYP7 (cholesterol 7α-hydroxylase, Noshiro and Okuda, 1990) and other steroidogenic P450s. Panel A: Identical amino acids are indicated by a bar; hyphens in the amino acid sequences indicate gaps introduced during alignment. The N-terminal hydrophobic leader sequences are underlined. The position of the conserved Thr residue within the O₂-binding pocket of other CYP's (43), but replaced by Asn in Hct-1 (position 294) and CYP7, is indicated by an asterisk. Panels B,C: conserved residues in the heme-binding (residues 440-453, B) and steroidogenic (residues 348-362, C) domains conserved between Hct-1 and other similar CYP's (overlined in A). Sequences are human CYP7 (7α-hydroxylase; 37); bovine CYP17 (17α-hydroxylase; 44); human CYP11B1 (steroid 11β-hydroxylase; 45); human CYP21B (21-hydroxylase; 11); human CYP11A1 (P450scc; cholesterol side-chain cleavage; 46); human CYP27 (27-hydroxylase; 47).

[0202] FIG. 5. Analysis of Hct-1 expression in adult mouse brain. The hybridization probe was a synthetic oligonucleotide corresponding to the 3′ untranslated region of mouse Hct-1 cDNA. Panel a: coronal section; panel b: coronal section, rostral to a, showing hybridization in corpus callosum, cc; fornix, f; and anterior commissure, ac; panel c: enlargement of section through the hippocampus; DG, dentate gyrus; panel d: section adjacent to the section in a hybridized with an oligonucleotide specific for opsin (negative control).

[0203] FIG. 6. Southern analysis of Hct-1 coding sequences in mouse, rat and human. Total DNA was cleaved as indicated with restriction endonucleases B, BamHI; E, EcoRI; H, HindIII; X, XbaI; resolved by agarose gel electrophoresis, and probed with rat Hct-1 cDNA clone 12 before exposure to autoradiography.

[0204] FIG. 7. Genomic DNA Southern blot analysis of Hct-1. (a) Mouse genomic DNA probed with the full-length mouse Hct-1 cDNA clone. (b) Rat genomic DNA probed with clone 14.5a (original 0.3 kb clone of rHct-1). 10 μg of genomic DNA was digested with the indicated enzymes.

[0205] FIG. 8. Genomic map of mouse Hct-1 (incomplete). Exons II, III, IV and VI are represented on the phage clones (filled boxes). Exons I and V are not located. As indicated in Table 4.1, the boundaries of exons II, III B (BamHI); H (HindIII); S (SacI); X (XhoI)

1 45 1763 base pairs nucleic acid single linear cDNA rat CDS 1..1242 1GCC TTG GAG TAC CAG TAT GTA ATG AAA AAC CCA AAA CAA TTA AGC TTT 48 AlaLeu Glu Tyr Gln Tyr Val Met Lys Asn Pro Lys Gln Leu Ser Phe 1 5 10 15GAG AAG TTC AGC CGA AGA TTA TCA GCG AAA GCC TTC TCT GTC AAG AAG 96 GluLys Phe Ser Arg Arg Leu Ser Ala Lys Ala Phe Ser Val Lys Lys 20 25 30 CTGCTA ACT AAT GAC GAC CTT AGC AAT GAC ATT CAC AGA GGC TAT CTT 144 Leu LeuThr Asn Asp Asp Leu Ser Asn Asp Ile His Arg Gly Tyr Leu 35 40 45 CTT TTACAA GGC AAA TCT CTG GAT GGT CTT CTG GAA ACC ATG ATC CAA 192 Leu Leu GlnGly Lys Ser Leu Asp Gly Leu Leu Glu Thr Met Ile Gln 50 55 60 GAA GTA AAAGAA ATA TTT GAG TCC AGA CTG CTA AAA CTC ACA GAT TGG 240 Glu Val Lys GluIle Phe Glu Ser Arg Leu Leu Lys Leu Thr Asp Trp 65 70 75 80 AAT ACA GCAAGA GTA TTT GAT TTC TGT AGT TCA CTG GTA TTT GAA ATC 288 Asn Thr Ala ArgVal Phe Asp Phe Cys Ser Ser Leu Val Phe Glu Ile 85 90 95 ACA TTT ACA ACTATA TAT GGA AAA ATT CTT GCT GCT AAC AAA AAA CAA 336 Thr Phe Thr Thr IleTyr Gly Lys Ile Leu Ala Ala Asn Lys Lys Gln 100 105 110 ATT ATC AGT GAGCTG AGG GAT GAT TTT TTA AAA TTT GAT GAC CAT TTC 384 Ile Ile Ser Glu LeuArg Asp Asp Phe Leu Lys Phe Asp Asp His Phe 115 120 125 CCA TAC TTA GTATCT GAC ATA CCT ATT CAG CTT CTA AGA AAT GCA GAA 432 Pro Tyr Leu Val SerAsp Ile Pro Ile Gln Leu Leu Arg Asn Ala Glu 130 135 140 TTT ATG CAG AAGAAA ATT ATA AAA TGT CTC ACA CCA GAA AAA GTA GCT 480 Phe Met Gln Lys LysIle Ile Lys Cys Leu Thr Pro Glu Lys Val Ala 145 150 155 160 CAG ATG CAAAGA CGG TCA GAA ATT GTT CAG GAG AGG CAG GAG ATG CTG 528 Gln Met Gln ArgArg Ser Glu Ile Val Gln Glu Arg Gln Glu Met Leu 165 170 175 AAA AAA TACTAC GGG CAT GAA GAG TTT GAA ATA GGA GCA CAT CAT CTT 576 Lys Lys Tyr TyrGly His Glu Glu Phe Glu Ile Gly Ala His His Leu 180 185 190 GGC TTG CTCTGG GCC TCT CTA GCA AAC ACC ATT CCA GCT ATG TTC TGG 624 Gly Leu Leu TrpAla Ser Leu Ala Asn Thr Ile Pro Ala Met Phe Trp 195 200 205 GCA ATG TATTAT CTT CTT CAG CAT CCA GAA GCT ATG GAA GTC CTG CGT 672 Ala Met Tyr TyrLeu Leu Gln 
His Pro Glu Ala Met Glu Val Leu Arg 210 215 220 GAC GAA ATTGAC AGC TTC CTG CAG TCA ACA GGT CAA AAG AAA GGA CCT 720 Asp Glu Ile AspSer Phe Leu Gln Ser Thr Gly Gln Lys Lys Gly Pro 225 230 235 240 GGA ATTTCT GTC CAC TTC ACC AGA GAA CAA TTG GAC AGC TTG GTC TGC 768 Gly Ile SerVal His Phe Thr Arg Glu Gln Leu Asp Ser Leu Val Cys 245 250 255 CTG GAAAGC GCT ATT CTT GAG GTT CTG AGG TTG TGC TCC TAC TCC AGC 816 Leu Glu SerAla Ile Leu Glu Val Leu Arg Leu Cys Ser Tyr Ser Ser 260 265 270 ATC ATCCGT GAA GTG CAA GAG GAT ATG GAT TTC AGC TCA GAG AGT AGG 864 Ile Ile ArgGlu Val Gln Glu Asp Met Asp Phe Ser Ser Glu Ser Arg 275 280 285 AGC TACCGT CTG CGG AAA GGA GAC TTT GTA GCT GTC TTT CCT CCA ATG 912 Ser Tyr ArgLeu Arg Lys Gly Asp Phe Val Ala Val Phe Pro Pro Met 290 295 300 ATA CACAAT GAC CCA GAA GTC TTC GAT GCT CCA AAG GAC TTT AGG TTT 960 Ile His AsnAsp Pro Glu Val Phe Asp Ala Pro Lys Asp Phe Arg Phe 305 310 315 320 GATCGC TTC GTA GAA GAT GGT AAG AAG AAA ACA ACG TTT TTC AAA GGA 1008 Asp ArgPhe Val Glu Asp Gly Lys Lys Lys Thr Thr Phe Phe Lys Gly 325 330 335 GGAAAA AAG CTG AAG AGT TAC ATT ATA CCA TTT GGA CTT GGA ACA AGC 1056 Gly LysLys Leu Lys Ser Tyr Ile Ile Pro Phe Gly Leu Gly Thr Ser 340 345 350 AAATGT CCA GGC AGA TAC TTT GCA ATT AAT GAA ATG AAG CTA CTA GTG 1104 Lys CysPro Gly Arg Tyr Phe Ala Ile Asn Glu Met Lys Leu Leu Val 355 360 365 ATTATA CTT TTA ACT TAT TTT GAT TTA GAA GTC ATT GAC ACT AAG CCT 1152 Ile IleLeu Leu Thr Tyr Phe Asp Leu Glu Val Ile Asp Thr Lys Pro 370 375 380 ATAGGA CTA AAC CAC AGT CGC ATG TTT CTG GGC ATT CAG CAT CCA GAC 1200 Ile GlyLeu Asn His Ser Arg Met Phe Leu Gly Ile Gln His Pro Asp 385 390 395 400TCT GAC ATC TCA TTT AGG TAC AAG GCA AAA TCT TGG AGA TCC 1242 Ser Asp IleSer Phe Arg Tyr Lys Ala Lys Ser Trp Arg Ser 405 410 TGAAAGGGTGGCAGAGAAGC TTAGCGGAAT AAGGCTGCAC ATGCTGAGCT CTGTGATT 1302 CTGTACTCCCCAAATGCAGC CACTATTCTT GTTTGTTAGA AAATGGCAAA TTTTTATT 1362 ATTGCGATCCATCCAGTTTG TTTTGGGTCA CAAAACCTGT CATAAAATAA AGCGCTGT 1422 TGGTGTAAAAAAATGTCATG GCAATCATTT CAGGATAAGG 
TAAAATAACG TTTTCAAG 1482 TGTACTTACTATGATTTTTA TCATTTGTAG TGAATGTGCT TTTCCAGTAA TAAATTTG 1542 CCAGGGTGATTTTTTTTAAT TACTGAAATC CTCTAATATC GGTTTTATGT GCTGCCAG 1602 AACTCTGCCATCAATGGACA GTATAACAAT TTCCAGTTTT CCAGAGAAGG GAGAAATT 1662 GCCCCATGAGTTACGCTGTA TAAAATTGTT CTCTTCAACT ATAATATCAA TAATGTCT 1722 ATCACCAGGTTACCTTTGCA TTAAATCGAG TTTTGCAAAA G 1763 414 amino acids amino acidlinear protein 2 Ala Leu Glu Tyr Gln Tyr Val Met Lys Asn Pro Lys Gln LeuSer Phe 1 5 10 15 Glu Lys Phe Ser Arg Arg Leu Ser Ala Lys Ala Phe SerVal Lys Lys 20 25 30 Leu Leu Thr Asn Asp Asp Leu Ser Asn Asp Ile His ArgGly Tyr Leu 35 40 45 Leu Leu Gln Gly Lys Ser Leu Asp Gly Leu Leu Glu ThrMet Ile Gln 50 55 60 Glu Val Lys Glu Ile Phe Glu Ser Arg Leu Leu Lys LeuThr Asp Trp 65 70 75 80 Asn Thr Ala Arg Val Phe Asp Phe Cys Ser Ser LeuVal Phe Glu Ile 85 90 95 Thr Phe Thr Thr Ile Tyr Gly Lys Ile Leu Ala AlaAsn Lys Lys Gln 100 105 110 Ile Ile Ser Glu Leu Arg Asp Asp Phe Leu LysPhe Asp Asp His Phe 115 120 125 Pro Tyr Leu Val Ser Asp Ile Pro Ile GlnLeu Leu Arg Asn Ala Glu 130 135 140 Phe Met Gln Lys Lys Ile Ile Lys CysLeu Thr Pro Glu Lys Val Ala 145 150 155 160 Gln Met Gln Arg Arg Ser GluIle Val Gln Glu Arg Gln Glu Met Leu 165 170 175 Lys Lys Tyr Tyr Gly HisGlu Glu Phe Glu Ile Gly Ala His His Leu 180 185 190 Gly Leu Leu Trp AlaSer Leu Ala Asn Thr Ile Pro Ala Met Phe Trp 195 200 205 Ala Met Tyr TyrLeu Leu Gln His Pro Glu Ala Met Glu Val Leu Arg 210 215 220 Asp Glu IleAsp Ser Phe Leu Gln Ser Thr Gly Gln Lys Lys Gly Pro 225 230 235 240 GlyIle Ser Val His Phe Thr Arg Glu Gln Leu Asp Ser Leu Val Cys 245 250 255Leu Glu Ser Ala Ile Leu Glu Val Leu Arg Leu Cys Ser Tyr Ser Ser 260 265270 Ile Ile Arg Glu Val Gln Glu Asp Met Asp Phe Ser Ser Glu Ser Arg 275280 285 Ser Tyr Arg Leu Arg Lys Gly Asp Phe Val Ala Val Phe Pro Pro Met290 295 300 Ile His Asn Asp Pro Glu Val Phe Asp Ala Pro Lys Asp Phe ArgPhe 305 310 315 320 Asp Arg Phe Val Glu Asp Gly Lys Lys Lys Thr Thr PhePhe Lys Gly 325 330 335 Gly Lys Lys Leu Lys Ser Tyr Ile Ile 
Pro Phe GlyLeu Gly Thr Ser 340 345 350 Lys Cys Pro Gly Arg Tyr Phe Ala Ile Asn GluMet Lys Leu Leu Val 355 360 365 Ile Ile Leu Leu Thr Tyr Phe Asp Leu GluVal Ile Asp Thr Lys Pro 370 375 380 Ile Gly Leu Asn His Ser Arg Met PheLeu Gly Ile Gln His Pro Asp 385 390 395 400 Ser Asp Ile Ser Phe Arg TyrLys Ala Lys Ser Trp Arg Ser 405 410 1880 base pairs nucleic acid singlelinear cDNA mouse CDS 81..1601 3 GGCAGGCACA GCCTCTGGTC TAAGAAGAGAGGGCACTGTG CAAAAGCCAT CGCTCCCTAC 60 AGAGCCGCCA GCTCGTCGGG ATG CAG GGAGCC ACG ACC CTA GAT GCC GCC 110 Met Gln Gly Ala Thr Thr Leu Asp Ala Ala1 5 10 TCG CCA GGG CCT CTC GCC CTC CTA GGC CTT CTC TTT GCC GCC ACC TTA158 Ser Pro Gly Pro Leu Ala Leu Leu Gly Leu Leu Phe Ala Ala Thr Leu 1520 25 CTG CTC TCG GCC CTG TTC CTC CTC ACC CGG CGC ACC AGG CGC CCT CGT206 Leu Leu Ser Ala Leu Phe Leu Leu Thr Arg Arg Thr Arg Arg Pro Arg 3035 40 GAA CCA CCC TTG ATA AAA GGT TGG CTT CCT TAT CTT GGC ATG GCC CTG254 Glu Pro Pro Leu Ile Lys Gly Trp Leu Pro Tyr Leu Gly Met Ala Leu 4550 55 AAA TTC TTT AAG GAT CCG TTA ACT TTC TTG AAA ACT CTT CAA AGG CAA302 Lys Phe Phe Lys Asp Pro Leu Thr Phe Leu Lys Thr Leu Gln Arg Gln 6065 70 CAT GGT GAC ACT TTC ACT GTC TTC CTT GTG GGG AAG TAT ATA ACA TTT350 His Gly Asp Thr Phe Thr Val Phe Leu Val Gly Lys Tyr Ile Thr Phe 7580 85 90 GTT CTG AAC CCT TTC CAG TAC CAG TAT GTA ACG AAA AAC CCA AAA CAA398 Val Leu Asn Pro Phe Gln Tyr Gln Tyr Val Thr Lys Asn Pro Lys Gln 95100 105 TTA AGC TTT CAG AAG TTC AGC AGC CGA TTA TCA GCG AAA GCC TTC TCT446 Leu Ser Phe Gln Lys Phe Ser Ser Arg Leu Ser Ala Lys Ala Phe Ser 110115 120 GTA AAG AAG CTG CTT ACT GAT GAC GAC CTT AAT GAA GAC GTT CAC AGA494 Val Lys Lys Leu Leu Thr Asp Asp Asp Leu Asn Glu Asp Val His Arg 125130 135 GCC TAT CTA CTT CTA CAA GGC AAA CCT TTG GAT GCT CTT CTG GAA ACT542 Ala Tyr Leu Leu Leu Gln Gly Lys Pro Leu Asp Ala Leu Leu Glu Thr 140145 150 ATG ATC CAA GAA GTA AAA GAA TTA TTT GAG TCC CAA CTG CTA AAA ATC590 Met Ile Gln Glu Val Lys Glu Leu Phe Glu Ser Gln Leu Leu Lys Ile 155160 165 170 ACA GAT 
TGG AAC ACA GAA AGA ATA TTT GCA TTC TGT GGC TCA CTGGTA 638 Thr Asp Trp Asn Thr Glu Arg Ile Phe Ala Phe Cys Gly Ser Leu Val175 180 185 TTT GAG ATC ACA TTT GCG ACT CTA TAT GGA AAA ATT CTT GCT GGTAAC 686 Phe Glu Ile Thr Phe Ala Thr Leu Tyr Gly Lys Ile Leu Ala Gly Asn190 195 200 AAG AAA CAA ATT ATC AGT GAG CTA AGG GAT GAT TTT TTT AAA TTTGAT 734 Lys Lys Gln Ile Ile Ser Glu Leu Arg Asp Asp Phe Phe Lys Phe Asp205 210 215 GAC ATG TTC CCA TAC TTA GTA TCT GAC ATA CCT ATT CAG CTT CTAAGA 782 Asp Met Phe Pro Tyr Leu Val Ser Asp Ile Pro Ile Gln Leu Leu Arg220 225 230 AAT GAA GAA TCT ATG CAG AAG AAA ATT ATA AAA TGC CTC ACA TCAGAA 830 Asn Glu Glu Ser Met Gln Lys Lys Ile Ile Lys Cys Leu Thr Ser Glu235 240 245 250 AAA GTA GCT CAG ATG CAA GGA CAG TCA AAA ATT GTT CAG GAAAGC CAA 878 Lys Val Ala Gln Met Gln Gly Gln Ser Lys Ile Val Gln Glu SerGln 255 260 265 GAT CTG CTG AAA AGA TAC TAT AGG CAT GAC GAT TCT GAA ATAGGA GCA 926 Asp Leu Leu Lys Arg Tyr Tyr Arg His Asp Asp Ser Glu Ile GlyAla 270 275 280 CAT CAT CTT GGC TTT CTC TGG GCC TCT CTA GCA AAC ACC ATTCCA GCT 974 His His Leu Gly Phe Leu Trp Ala Ser Leu Ala Asn Thr Ile ProAla 285 290 295 ATG TTC TGG GCA ATG TAT TAT ATT CTT CGG CAT CCT GAA GCTATG GAA 1022 Met Phe Trp Ala Met Tyr Tyr Ile Leu Arg His Pro Glu Ala MetGlu 300 305 310 GCC CTG CGT GAC GAA ATT GAC AGT TTC CTG CAG TCA ACA GGTCAA AAG 1070 Ala Leu Arg Asp Glu Ile Asp Ser Phe Leu Gln Ser Thr Gly GlnLys 315 320 325 330 AAA GGG CCT GGA ATT TCA GTC CAC TTC ACC AGA GAA CAATTG GAC AGC 1118 Lys Gly Pro Gly Ile Ser Val His Phe Thr Arg Glu Gln LeuAsp Ser 335 340 345 TTG GTC TGC CTG GAA AGC ACT ATT CTT GAG GTT CTG AGGCTG TGC TCA 1166 Leu Val Cys Leu Glu Ser Thr Ile Leu Glu Val Leu Arg LeuCys Ser 350 355 360 TAC TCC AGC ATC ATC CGA GAA GTG CAG GAG GAT ATG AATCTC AGC TTA 1214 Tyr Ser Ser Ile Ile Arg Glu Val Gln Glu Asp Met Asn LeuSer Leu 365 370 375 GAG AGT AAG AGT TTC TCT CTG CGG AAA GGA GAT TTT GTAGCC CTC TTT 1262 Glu Ser Lys Ser Phe Ser Leu Arg Lys Gly Asp Phe Val AlaLeu Phe 380 385 390 CCT 
CCA CTC ATA CAC AAT GAC CCG GAA ATC TTC GAT GCTCCA AAG GAA 1310 Pro Pro Leu Ile His Asn Asp Pro Glu Ile Phe Asp Ala ProLys Glu 395 400 405 410 TTT AGG TTC GAT CGG TTC ATA GAA GAT GGT AAG AAGAAA AGC ACG TTT 1358 Phe Arg Phe Asp Arg Phe Ile Glu Asp Gly Lys Lys LysSer Thr Phe 415 420 425 TTC AAA GGA GGG AAG AGG CTG AAG ACT TAC GTT ATGCCT TTT GGA CTC 1406 Phe Lys Gly Gly Lys Arg Leu Lys Thr Tyr Val Met ProPhe Gly Leu 430 435 440 GGA ACA AGC AAA TGT CCA GGG AGA TAT TTT GCA GTGAAC GAA ATG AAG 1454 Gly Thr Ser Lys Cys Pro Gly Arg Tyr Phe Ala Val AsnGlu Met Lys 445 450 455 CTA CTG CTG ATT GAG CTT TTA ACT TAT TTT GAT TTAGAA ATT ATC GAC 1502 Leu Leu Leu Ile Glu Leu Leu Thr Tyr Phe Asp Leu GluIle Ile Asp 460 465 470 AGG AAG CCT ATA GGG CTA AAT CAC AGT CGG ATG TTTTTA GGT ATT CAG 1550 Arg Lys Pro Ile Gly Leu Asn His Ser Arg Met Phe LeuGly Ile Gln 475 480 485 490 CAC CCC GAT TCT GCC GTC TCC TTT AGG TAC AAAGCA AAA TCT TGG AGA 1598 His Pro Asp Ser Ala Val Ser Phe Arg Tyr Lys AlaLys Ser Trp Arg 495 500 505 AGC TGAAAGTGTG GCAGAGAAGC TTTGCAGAGTAAGGCTGCAT GTGCTGAGCT 1651 Ser CCGTGATTTG GTGCACTCCC CCAAATGCAACCGCTACTCT TGTTTGAAAA TGGCAAAT 1711 ATATTTGGTT GAGATCAATC CAGTTGGTTTTGGGTCACAA AACCTGTCAT AAAATAAA 1771 AGTGTGATGG TTTAAAAAAT GTCATGGCAATCATTTCAGG ATAAGGTAAA ATAACATT 1831 CAAGTTTGTA CTTACTATGA TTTTTATCATTTGTAGTGAA TGTGCTTTT 1880 507 amino acids amino acid linear protein 4Met Gln Gly Ala Thr Thr Leu Asp Ala Ala Ser Pro Gly Pro Leu Ala 1 5 1015 Leu Leu Gly Leu Leu Phe Ala Ala Thr Leu Leu Leu Ser Ala Leu Phe 20 2530 Leu Leu Thr Arg Arg Thr Arg Arg Pro Arg Glu Pro Pro Leu Ile Lys 35 4045 Gly Trp Leu Pro Tyr Leu Gly Met Ala Leu Lys Phe Phe Lys Asp Pro 50 5560 Leu Thr Phe Leu Lys Thr Leu Gln Arg Gln His Gly Asp Thr Phe Thr 65 7075 80 Val Phe Leu Val Gly Lys Tyr Ile Thr Phe Val Leu Asn Pro Phe Gln 8590 95 Tyr Gln Tyr Val Thr Lys Asn Pro Lys Gln Leu Ser Phe Gln Lys Phe100 105 110 Ser Ser Arg Leu Ser Ala Lys Ala Phe Ser Val Lys Lys Leu LeuThr 115 120 125 Asp Asp Asp Leu Asn Glu Asp Val His 
Arg Ala Tyr Leu LeuLeu Gln 130 135 140 Gly Lys Pro Leu Asp Ala Leu Leu Glu Thr Met Ile GlnGlu Val Lys 145 150 155 160 Glu Leu Phe Glu Ser Gln Leu Leu Lys Ile ThrAsp Trp Asn Thr Glu 165 170 175 Arg Ile Phe Ala Phe Cys Gly Ser Leu ValPhe Glu Ile Thr Phe Ala 180 185 190 Thr Leu Tyr Gly Lys Ile Leu Ala GlyAsn Lys Lys Gln Ile Ile Ser 195 200 205 Glu Leu Arg Asp Asp Phe Phe LysPhe Asp Asp Met Phe Pro Tyr Leu 210 215 220 Val Ser Asp Ile Pro Ile GlnLeu Leu Arg Asn Glu Glu Ser Met Gln 225 230 235 240 Lys Lys Ile Ile LysCys Leu Thr Ser Glu Lys Val Ala Gln Met Gln 245 250 255 Gly Gln Ser LysIle Val Gln Glu Ser Gln Asp Leu Leu Lys Arg Tyr 260 265 270 Tyr Arg HisAsp Asp Ser Glu Ile Gly Ala His His Leu Gly Phe Leu 275 280 285 Trp AlaSer Leu Ala Asn Thr Ile Pro Ala Met Phe Trp Ala Met Tyr 290 295 300 TyrIle Leu Arg His Pro Glu Ala Met Glu Ala Leu Arg Asp Glu Ile 305 310 315320 Asp Ser Phe Leu Gln Ser Thr Gly Gln Lys Lys Gly Pro Gly Ile Ser 325330 335 Val His Phe Thr Arg Glu Gln Leu Asp Ser Leu Val Cys Leu Glu Ser340 345 350 Thr Ile Leu Glu Val Leu Arg Leu Cys Ser Tyr Ser Ser Ile IleArg 355 360 365 Glu Val Gln Glu Asp Met Asn Leu Ser Leu Glu Ser Lys SerPhe Ser 370 375 380 Leu Arg Lys Gly Asp Phe Val Ala Leu Phe Pro Pro LeuIle His Asn 385 390 395 400 Asp Pro Glu Ile Phe Asp Ala Pro Lys Glu PheArg Phe Asp Arg Phe 405 410 415 Ile Glu Asp Gly Lys Lys Lys Ser Thr PhePhe Lys Gly Gly Lys Arg 420 425 430 Leu Lys Thr Tyr Val Met Pro Phe GlyLeu Gly Thr Ser Lys Cys Pro 435 440 445 Gly Arg Tyr Phe Ala Val Asn GluMet Lys Leu Leu Leu Ile Glu Leu 450 455 460 Leu Thr Tyr Phe Asp Leu GluIle Ile Asp Arg Lys Pro Ile Gly Leu 465 470 475 480 Asn His Ser Arg MetPhe Leu Gly Ile Gln His Pro Asp Ser Ala Val 485 490 495 Ser Phe Arg TyrLys Ala Lys Ser Trp Arg Ser 500 505 3846 base pairs nucleic acid singlelinear cDNA human CDS join(831..1422, 1873..2078) intron 1..830 exon831..1422 intron 1423..1872 exon 1873..2078 intron 2079..3846 5GGATCCAACC AAGTTTCCAG ATCTTATAAA TGTGGTGAAT GGTGAATGAC TTCCTGAAGA 
60ATGGATGAAT GGATGTGTTC TAGTTTGGAA TCCTGTGTCA GTCACAAGTC AATATGTGA 120CTTGAACATG TTATTAAATC TCCCACATCC ATAAAAGTGA AAATGCTGGC ATTAGTGGA 180TTTTGCCAGT GTTGAATTAG ACATTTATTT GTGAGTACCT GCTCCATACA GTATGGTCA 240TTATTTGAGT TAAAATTGTT GTATTTGAAC AAAACTCAGA TGACACCTAA GCATGAAAA 300GCTCTTTATG AAGTATAAAT ACTCAGAAAT GGAATGGCAT GTTGCCAATT TGTTTTCTG 360TTTATTGAGG GAAATATATG AGAAGTATTT AAGTCAGGGG ATTATGAGGA ATATTTAAA 420GATANNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNN 480NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNN 540NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNN 600NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNTCTAGA GTGTTTTCCA CCATCTTTC 660AAGGAAACAT GTAGTGTACC TTCGAATGAA ATGGATTTGT ATTAAACTTT TTGCCTTAG 720TATTAGGGTC TTTCTAATTT TTGATTAACA TATTTTTTTA ATTTGTGGTG TTTATTTCT 780TTTTTATTAA CAAACGAACT CATATGCTCC TCTCTCTTTT TTTTTTTTCT GGA AAG 836 GlyLys 1 TAC ATA ACA TTT ATA CCT GGA CCC TTC CAG TAC CAG CTA GTG ATA AAA884 Tyr Ile Thr Phe Ile Pro Gly Pro Phe Gln Tyr Gln Leu Val Ile Lys 5 1015 AAT CAT AAA CAA TTA AGC TTT CGA GTA TCT TCT AAT AAA TTA TCA GAG 932Asn His Lys Gln Leu Ser Phe Arg Val Ser Ser Asn Lys Leu Ser Glu 20 25 30AAA GCA TTT AGC ATC AGT CAG TTG CAA AAA AAT CAT GAC ATG AAT GAT 980 LysAla Phe Ser Ile Ser Gln Leu Gln Lys Asn His Asp Met Asn Asp 35 40 45 50GAG CTT CAC CTC TGC TAT CAA TTT TTG CAA GGC AAA TCT TTG GAC ATA 1028 GluLeu His Leu Cys Tyr Gln Phe Leu Gln Gly Lys Ser Leu Asp Ile 55 60 65 CTCTTG GAA AGC ATG ATG CAG AAT CTA AAA CAA GTT TTT GAA CCC CAG 1076 Leu LeuGlu Ser Met Met Gln Asn Leu Lys Gln Val Phe Glu Pro Gln 70 75 80 CTG TTAAAA ACC ACA AGT TGG GAC ACG GCA GAA CTG TAT CCA TTC TGC 1124 Leu Leu LysThr Thr Ser Trp Asp Thr Ala Glu Leu Tyr Pro Phe Cys 85 90 95 AGC TCA ATAATA TTT GAG ATC ACA TTT ACA ACT ATA TAT GGA AAA GTT 1172 Ser Ser Ile IlePhe Glu Ile Thr Phe Thr Thr Ile Tyr Gly Lys Val 100 105 110 ATT GTT TGTGAC AAC AAC AAA TTT ATT AGT GAG CTA AGA GAT GAT TTT 1220 Ile Val Cys AspAsn Asn Lys Phe Ile Ser Glu Leu Arg Asp 
Asp Phe 115 120 125 130 TTA AAATTT GAT GAC AAG TTT GCA TAT TTA GTA TCC AAC ATA CCC ATT 1268 Leu Lys PheAsp Asp Lys Phe Ala Tyr Leu Val Ser Asn Ile Pro Ile 135 140 145 GAG CTTCTA GGA AAT GTC AAG TCT ATT AGA GAG AAA ATT ATA AAA TGC 1316 Glu Leu LeuGly Asn Val Lys Ser Ile Arg Glu Lys Ile Ile Lys Cys 150 155 160 TTC TCATCA GAA AAG TTA GCC AAG ATG CAA GGA TGG TCA GAA GTT TTT 1364 Phe Ser SerGlu Lys Leu Ala Lys Met Gln Gly Trp Ser Glu Val Phe 165 170 175 CAA AGCAGG CAA GAT GAC CTG GAG AAA TAT TAT GTG CAC GAG GAC CTT 1412 Gln Ser ArgGln Asp Asp Leu Glu Lys Tyr Tyr Val His Glu Asp Leu 180 185 190 GAA ATAGGA G GTAAGAACTT CTGAATGAGC ACTTGCCTAA ATAAAAATCA 1462 Glu Ile Gly 195TTTACATAGA CCTCTGAAAT AAAAAAAGAC AAAATGGCGA CCTTGAAAAT TTTTTTAT 1522TCTTTCTAAT TGGCTAATGA TAAATGTTTA CTCTGATATA ACCTCTATAA TTGATATT 1582TTTTTTTGCT GAGGTGGTAA ACAGATACTT AATGGTGATA ATGAGAAAGC GTATAACT 1642GCTGCATTTA TCCCTCTTAT CTCATCCCCG ACCACACCGC CCCCCCCATA CACATTAC 1702TTTAAACTAT TCTCATTAAG CAGAAAATTA GACTTCAGAA GCCTATTGGT TCTCATTA 1762ATGCAGTGAT CCTTGGCTGG TCTGTGTCCT AACATCTTTT AATTAGCACA CTGCAAAT 1822AATCAGTGTA ATAAACGCTA TTAATCTTCC TTTACACTTA TTTTCTCCCA CA CAT 1877 AlaHis CAT TTA GGC TTT CTC TGG GCC TCT GTG GCA AAC ACT ATT CCA ACT ATG 1925His Leu Gly Phe Leu Trp Ala Ser Val Ala Asn Thr Ile Pro Thr Met 200 205210 215 TTC TGG GCA ACG TAT TAT CTT CTG CGG CAC CCA GAA GCT ATG GCA GCA1973 Phe Trp Ala Thr Tyr Tyr Leu Leu Arg His Pro Glu Ala Met Ala Ala 220225 230 GTG CGT GAC GAA ATT GAC CGT TTG CTG CAG TCA ACA GGT CAA AAG GAA2021 Val Arg Asp Glu Ile Asp Arg Leu Leu Gln Ser Thr Gly Gln Lys Glu 235240 245 GGG TCT GGA TTT CCC ATC CAC CTC ACC AGA GAA CAA TTG GAC AGC CTA2069 Gly Ser Gly Phe Pro Ile His Leu Thr Arg Glu Gln Leu Asp Ser Leu 250255 260 ATC TGC CTA GGTAATTATT TTATCTGTTA TGAAGAAAGA AGGTACCTCT 2118 IleCys Leu 265 CTGCAAACTC GGTTTATCAC TCATAGCTGT TTACAAGAGG TAGAGGACACAGCTGCTA 2178 TGACATAATA ACTCCCATTT ACATCAATTA TAAATTATGT AGTTTATAGCCGTAGATC 2238 CTCATTGCAT GTAAACATAA GGCCTANGTA ATTAACTGTG 
NAANGTATGNAAAANNCT 2298 CCAAAGCTTN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNNNNNNNNNN 2358 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNNNNNNNNNN 2418 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNNNNNNNNNN 2478 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNNNNNNNNNN 2538 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNNNNNNNNNN 2598 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNNNNNNNNNN 2658 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNNNNNNNNNN 2718 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNNNNNNNNNN 2778 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNNNNNNNNNN 2838 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNNNNNNNNNN 2898 NNNNNNNNNN NNNNNNNNNC CTGACTGAAC TTCTTACTGC CAAAGTTAAATTCCATAC 2958 ATGAGTTATT CTCTATTCTC TCTGTATTGA CATTTCATCT GCGGTATCCTTTAGGGTA 3018 ATATTCCAAG TTTCTTTAGA CAAACGCAGG AACAAATGTT CACATATTTCTGTTTCTT 3078 TTCCTTTGAC AAGTAGGCGA GCATTTTAGC CTATGTTGGT CTCAAAAAAAATCTTTTA 3138 TATGTTCCAG GTTCTTTAAT GGGACCTTTC AGGAGCAAAA GTCCTCCCAGGTTTGGTC 3198 TGTTCACCCT CNGTGGCCAT TGAGGAAAAT GCCCNNNNNG TTCTAGAGATTGTTCTCA 3258 TCTCAGGCTA AGGCCCATTG AGCAATGCCA GAAAGCATGC CTTATACTAGCAGTCAAT 3318 GGAAGTTTGT AGTTTGTGTC TTTAGCATAG GTTATCAAAT AAATTTTATATTTNCTTT 3378 AAAAAATCTC AACATTACTA AAATACAAAT ATCCTTTTAT TTTTCTTTGCAGAATTAT 3438 GGGAACAAAT CCAGAAAATT TGTGTAAATT TCGGGTAGTT GCTCCACTTGATACACAG 3498 TTTCTGCATA TTGTAATTTC TATGAAGATC TAGGTTGCAT TTCCCATACATTCAAGCA 3558 TTCCATTGCA TTTTTATGAA TAAGATGACG CATACTGGGA AGTAAGGCAAATACACTA 3618 AGGAATATGT GTTTGTATTC TGTATAGTTA TTACTCTTAA AAAAAGTAGTTGTAATTC 3678 CCACTCTTTT TACTTTCAAC TTTTTGCTAT TAAAAAATCA TTTTTAAATTTCAGTATT 3738 AGCAGAAACA TTTAAATTTA TTAGACCAGA AAAATAACAG ATTCTAGAACTATAATTT 3798 ATCCATTTAA GCCCATAGCT AGAGCTAGAG ATTTTCACTA TTGGATCC 3846266 amino acids amino acid linear protein 6 Gly Lys Tyr Ile Thr Phe IlePro Gly Pro Phe Gln Tyr Gln Leu Val 1 5 10 15 Ile Lys Asn His Lys GlnLeu Ser Phe Arg Val Ser Ser Asn Lys Leu 20 25 30 Ser Glu Lys Ala Phe SerIle 
Ser Gln Leu Gln Lys Asn His Asp Met 35 40 45 Asn Asp Glu Leu His LeuCys Tyr Gln Phe Leu Gln Gly Lys Ser Leu 50 55 60 Asp Ile Leu Leu Glu SerMet Met Gln Asn Leu Lys Gln Val Phe Glu 65 70 75 80 Pro Gln Leu Leu LysThr Thr Ser Trp Asp Thr Ala Glu Leu Tyr Pro 85 90 95 Phe Cys Ser Ser IleIle Phe Glu Ile Thr Phe Thr Thr Ile Tyr Gly 100 105 110 Lys Val Ile ValCys Asp Asn Asn Lys Phe Ile Ser Glu Leu Arg Asp 115 120 125 Asp Phe LeuLys Phe Asp Asp Lys Phe Ala Tyr Leu Val Ser Asn Ile 130 135 140 Pro IleGlu Leu Leu Gly Asn Val Lys Ser Ile Arg Glu Lys Ile Ile 145 150 155 160Lys Cys Phe Ser Ser Glu Lys Leu Ala Lys Met Gln Gly Trp Ser Glu 165 170175 Val Phe Gln Ser Arg Gln Asp Asp Leu Glu Lys Tyr Tyr Val His Glu 180185 190 Asp Leu Glu Ile Gly Ala His His Leu Gly Phe Leu Trp Ala Ser Val195 200 205 Ala Asn Thr Ile Pro Thr Met Phe Trp Ala Thr Tyr Tyr Leu LeuArg 210 215 220 His Pro Glu Ala Met Ala Ala Val Arg Asp Glu Ile Asp ArgLeu Leu 225 230 235 240 Gln Ser Thr Gly Gln Lys Glu Gly Ser Gly Phe ProIle His Leu Thr 245 250 255 Arg Glu Gln Leu Asp Ser Leu Ile Cys Leu 260265 29 base pairs nucleic acid single linear DNA 7 CAATTCGCGG CCGCTTTTTTTTTTTTTTT 29 12 base pairs nucleic acid single linear DNA 8 CGACAGCAACGG 12 16 base pairs nucleic acid single linear DNA 9 AATTCCGTTG CTGTCG16 14 amino acids amino acid <Unknown> linear peptide 10 Phe Xaa Xaa GlyXaa Xaa Xaa Cys Xaa Gly Xaa Xaa Xaa Ala 1 5 10 12 base pairs nucleicacid single linear DNA 11 GATCGCGGCC GC 12 31 base pairs nucleic acidsingle linear DNA 12 GGCCCTCGAG CCACCATGCA GGGGAGCCAC G 31 26 base pairsnucleic acid single linear DNA 13 GGCCGAATTC TCAGCTTCTC CAAGAA 26 42base pairs nucleic acid single linear DNA 14 GACAGGTTTT GTGACCCAAAACAAACTGGA TGGATCGCAA TC 42 42 base pairs nucleic acid single linear DNA15 ATCACGGAGC TCAGCACATG CAGCCTTACT CTGCAAAGCT TC 42 48 base pairsnucleic acid single linear DNA 16 AGCCTTCTGG GTCGTAGCTG ACTCCTGCTGCTGAGCTGCA ACAGCTTT 48 45 base pairs nucleic acid single linear DNA 17TATATCCATA CCAACTTATT GGGAGTCCCA 
TCCTACCTCA TCAGC 45 506 amino acidsamino acid <Unknown> linear protein 18 Met Met Thr Thr Ser Leu Ile TrpGly Ile Ala Ile Ala Ala Cys Cy 1 5 10 15 Cys Leu Trp Leu Ile Leu Gly IleArg Arg Arg Gln Thr Gly Glu Pr 20 25 30 Pro Leu Glu Asn Gly Leu Gly LeuIle Pro Tyr Leu Gly Cys Ala Le 35 40 45 Gln Phe Gly Ala Asn Pro Leu GluPhe Leu Arg Ala Asn Gln Arg Ly 50 55 60 His Gly His Val Phe Thr Cys LysLeu Met Gly Lys Tyr Val His Ph 65 70 75 80 Ile Thr Asn Pro Leu Ser TyrHis Lys Val Leu Cys His Gly Lys Ty 85 90 95 Phe Asp Trp Lys Lys Phe HisPhe Ala Thr Ser Ala Lys Ala Phe Gl 100 105 110 His Arg Ser Ile Asp ProMet Asp Gly Asn Thr Thr Glu Asn Ile As 115 120 125 Asp Thr Phe Ile LysThr Leu Gln Gly His Ala Leu Asn Ser Leu Th 130 135 140 Glu Ser Met MetGlu Asn Leu Gln Arg Ile Met Arg Pro Pro Val Se 145 150 155 160 Ser AsnSer Lys Thr Ala Ala Trp Val Thr Glu Gly Met Tyr Ser Ph 165 170 175 CysTyr Arg Val Met Phe Glu Ala Gly Tyr Leu Thr Ile Phe Gly Ar 180 185 190Asp Leu Thr Arg Arg Asp Thr Gln Lys Ala His Ile Leu Asn Asn Le 195 200205 Asp Asn Phe Lys Gln Phe Asp Lys Val Phe Pro Ala Leu Val Ala Gl 210215 220 Leu Pro Ile His Met Phe Arg Thr Ala His Asn Ala Arg Glu Lys Le225 230 235 240 Ala Glu Ser Leu Arg His Glu Asn Leu Gln Lys Arg Glu SerIle Se 245 250 255 Glu Leu Ile Ser Leu Arg Met Phe Leu Asn Asp Thr LeuSer Thr Ph 260 265 270 Asp Asp Leu Glu Lys Ala Lys Thr His Leu Val ValLeu Trp Ala Se 275 280 285 Gln Ala Asn Thr Ile Pro Ala Thr Phe Trp SerLeu Phe Gln Met Il 290 295 300 Arg Asn Pro Glu Ala Met Lys Ala Ala ThrGlu Glu Val Lys Arg Th 305 310 315 320 Leu Glu Asn Ala Gly Gln Lys ValSer Leu Glu Gly Asn Pro Ile Cy 325 330 335 Leu Ser Gln Ala Glu Leu AsnAsp Leu Pro Val Leu Asn Ser Ile Il 340 345 350 Lys Glu Ser Leu Arg LeuSer Ser Ala Ser Leu Asn Ile Arg Thr Al 355 360 365 Lys Glu Asp Phe ThrLeu His Leu Glu Asp Gly Ser Tyr Asn Ile Ar 370 375 380 Lys Asp Ser IleIle Ala Leu Tyr Pro Gln Leu Met His Leu Asp Pr 385 390 395 400 Glu IleTyr Pro Asp Pro Leu Thr Phe Lys Tyr Asp Arg Tyr Leu As 405 410 415 
GluAsn Gly Lys Thr Lys Thr Thr Phe Tyr Cys Asn Gly Leu Lys Le 420 425 430Lys Tyr Tyr Tyr Met Pro Phe Gly Ser Gly Ala Thr Ile Cys Pro Gl 435 440445 Arg Leu Phe Ala Ile His Glu Ile Lys Gln Phe Leu Ile Leu Met Le 450455 460 Ser Tyr Phe Glu Leu Glu Leu Ile Glu Gly Gln Ala Lys Cys Pro Pr465 470 475 480 Leu Asp Gln Ser Arg Ala Gly Leu Gly Ile Leu Pro Pro LeuAsn As 485 490 495 Ile Glu Phe Lys Tyr Lys Phe Lys His Leu 500 505 14amino acids amino acid <Unknown> linear peptide 19 Phe Gly Leu Gly ThrSer Lys Cys Pro Gly Arg Tyr Phe Ala 1 5 10 14 amino acids amino acid<Unknown> linear peptide 20 Phe Gly Ser Gly Ala Thr Ile Cys Pro Gly ArgLeu Phe Ala 1 5 10 14 amino acids amino acid <Unknown> linear peptide 21Phe Gly Ala Gly Pro Arg Ser Cys Val Gly Glu Met Leu Ala 1 5 10 14 aminoacids amino acid <Unknown> linear peptide 22 Phe Gly Phe Gly Met Arg GlnCys Leu Gly Arg Arg Leu Ala 1 5 10 14 amino acids amino acid <Unknown>linear peptide 23 Phe Gly Cys Gly Ala Arg Val Cys Leu Gly Glu Pro ValAla 1 5 10 14 amino acids amino acid <Unknown> linear peptide 24 Phe GlyTrp Gly Val Arg Gln Cys Leu Gly Arg Arg Ile Ala 1 5 10 14 amino acidsamino acid <Unknown> linear peptide 25 Phe Gly Tyr Gly Val Arg Ala CysLeu Gly Arg Arg Ile Ala 1 5 10 15 amino acids amino acid <Unknown>linear peptide 26 Val Cys Leu Glu Ser Thr Ile Leu Glu Val Leu Arg LeuCys Ser 1 5 10 15 15 amino acids amino acid <Unknown> linear peptide 27Pro Val Leu Asn Ser Ile Ile Lys Glu Ser Leu Arg Leu Ser Ser 1 5 10 15 15amino acids amino acid <Unknown> linear peptide 28 Val Leu Leu Glu HisThr Ile Arg Glu Val Leu Arg Ile Arg Pro 1 5 10 15 15 amino acids aminoacid <Unknown> linear peptide 29 Pro Leu Leu Arg Ala Ala Leu Lys Glu ThrLeu Arg Leu Tyr Pro 1 5 10 15 15 amino acids amino acid <Unknown> linearpeptide 30 Pro Leu Leu Asn Ala Thr Ile Ala Glu Val Leu Arg Leu Pro Val 15 10 15 15 amino acids amino acid <Unknown> linear peptide 31 Pro LeuLeu Lys Ala Ser Ile Lys Glu Thr Leu Arg Leu His Pro 1 5 10 15 15 aminoacids amino acid <Unknown> 
linear peptide 32 Pro Leu Leu Lys Ala Val LeuLys Glu Thr Leu Arg Leu Tyr Pro 1 5 10 15 266 amino acids amino acid<Unknown> linear protein 33 Gly Lys Tyr Ile Thr Phe Val Leu Asn Pro PheGln Tyr Gln Tyr Va 1 5 10 15 Thr Lys Asn Pro Lys Gln Leu Ser Phe Gln LysPhe Ser Ser Arg Le 20 25 30 Ser Ala Lys Ala Phe Ser Val Lys Lys Leu LeuThr Asp Asp Asp Le 35 40 45 Asn Glu Asp Val His Arg Ala Tyr Leu Leu LeuGln Gly Lys Pro Le 50 55 60 Asp Ala Leu Leu Glu Thr Met Ile Gln Glu ValLys Glu Leu Phe Gl 65 70 75 80 Ser Gln Leu Leu Lys Ile Thr Asp Trp AsnThr Glu Arg Ile Phe Al 85 90 95 Phe Cys Gly Ser Leu Val Phe Glu Ile ThrPhe Ala Thr Leu Tyr Gl 100 105 110 Lys Ile Leu Ala Gly Asn Lys Lys GlnIle Ile Ser Glu Leu Arg As 115 120 125 Asp Phe Phe Lys Phe Asp Asp MetPhe Pro Tyr Leu Val Ser Asp Il 130 135 140 Pro Ile Gln Leu Leu Arg AsnGlu Glu Ser Met Gln Lys Lys Ile Il 145 150 155 160 Lys Cys Leu Thr SerGlu Lys Val Ala Gln Met Gln Gly Gln Ser Ly 165 170 175 Ile Val Gln GluSer Gln Asp Leu Leu Lys Arg Tyr Tyr Arg His As 180 185 190 Asp Ser GluIle Gly Ala His His Leu Gly Phe Leu Trp Ala Ser Le 195 200 205 Ala AsnThr Ile Pro Ala Met Phe Trp Ala Met Tyr Tyr Ile Leu Ar 210 215 220 HisPro Glu Ala Met Glu Ala Leu Arg Asp Glu Ile Asp Ser Phe Le 225 230 235240 Gln Ser Thr Gly Gln Lys Lys Gly Pro Gly Ile Ser Val His Phe Th 245250 255 Arg Glu Gln Leu Asp Ser Leu Val Cys Leu 260 265 18 base pairsnucleic acid single linear DNA 34 CTCCAGCCAT GGTCCTCG 18 18 base pairsnucleic acid single linear DNA 35 GTCTCGCCAT GCTGCTCC 18 18 base pairsnucleic acid single linear DNA 36 CAGCCACCAT GTGGGAGC 18 18 base pairsnucleic acid single linear DNA 37 TCGTCGGGAT GCAGGGAG 18 18 base pairsnucleic acid single linear DNA 38 TTTGCAAAAT GATGACCA 18 18 base pairsnucleic acid single linear DNA 39 TTTGCAAAAT GATGACTA 18 18 base pairsnucleic acid single linear DNA 40 TTTGCAAAAT GATGAGCA 18 18 base pairsnucleic acid single linear DNA 41 TCGGATCCAT GGCTGCGC 18 18 base pairsnucleic acid single linear DNA 42 CACGATCTAT GGCTGTGT 18 18 
base pairsnucleic acid single linear DNA 43 TCGCCACCAT GCAGGGAG 18 30 base pairsnucleic acid single linear DNA 44 GGCCCTCGAG CCACCATGCA GGGAGCCACG 30 25base pairs nucleic acid single linear DNA 45 GGCCGAATTC TCAGCTTCTC CAAGA25
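The relationship between the 14-residue consensus peptide of SEQ ID NO: 10 and the 14-residue peptides of SEQ ID NOS: 19 to 21 in the listing above can be checked mechanically. The following is a minimal sketch only, assuming that each "Xaa" position accepts any residue. The helper names (`to_one_letter`, `consensus_to_regex`) are hypothetical and introduced purely for illustration; they form no part of the original disclosure.

```python
import re

# Standard three-letter to one-letter amino acid codes.
# "Xaa" is mapped to "X" and treated as "any residue" (an assumption of this sketch).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter: str) -> str:
    """Convert a space-separated three-letter peptide to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split())


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Turn a one-letter consensus into a regex, with X matching any residue."""
    return re.compile("".join("." if c == "X" else c for c in consensus))


# SEQ ID NO: 10 as given in the sequence listing.
CONSENSUS = to_one_letter(
    "Phe Xaa Xaa Gly Xaa Xaa Xaa Cys Xaa Gly Xaa Xaa Xaa Ala"
)

# SEQ ID NOS: 19 to 21 as given in the sequence listing.
PEPTIDES = {
    19: "Phe Gly Leu Gly Thr Ser Lys Cys Pro Gly Arg Tyr Phe Ala",
    20: "Phe Gly Ser Gly Ala Thr Ile Cys Pro Gly Arg Leu Phe Ala",
    21: "Phe Gly Ala Gly Pro Arg Ser Cys Val Gly Glu Met Leu Ala",
}

pattern = consensus_to_regex(CONSENSUS)

# True for every peptide whose full length fits the consensus.
matches = {
    seq_id: bool(pattern.fullmatch(to_one_letter(peptide)))
    for seq_id, peptide in PEPTIDES.items()
}

if __name__ == "__main__":
    for seq_id, ok in matches.items():
        print(f"SEQ ID NO: {seq_id} fits consensus: {ok}")
```

The same pattern could be applied to SEQ ID NOS: 22 to 25 by adding them to the `PEPTIDES` dictionary.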

1. A DNA molecule selected from the following:
(a) DNA molecules containing the coding sequence set forth in SEQ Id No: 1 beginning at nucleotide 22 and ending at nucleotide 1541,
(b) DNA molecules containing the coding sequence set forth in SEQ Id No: 2 beginning at nucleotide 1 and ending at nucleotide 1242,
(c) DNA molecules capable of hybridizing with the DNA molecule defined in (a) or (b) under standard hybridization conditions defined as 2×SSC at 65° C., and
(d) cytochrome P450-encoding DNA molecules capable of hybridizing with the DNA molecule defined in (a), (b) or (c) under reduced stringency hybridization conditions defined as 6×SSC at 55° C.
2. A DNA molecule according to claim 1 (c) or (d) comprising an Hct-1 gene-associated sequence of another vertebrate species, especially a mammalian species and in particular a human Hct-1 gene-associated sequence.
3. A DNA molecule according to claim 2 selected from the following:
(e) DNA molecules comprising one or more sequences selected from (i) the sequence designated “intron 2” in SEQ Id No 3, (ii) the sequence designated “exon 3” in SEQ Id No 3, (iii) the sequence designated “intron 3” in SEQ Id No 3, (iv) the sequence designated “exon 4” in SEQ Id No 3, and (v) the sequence designated “intron 5” in SEQ Id No 3;
(f) DNA molecules capable of hybridizing with the DNA molecules defined in (e) under standard hybridization conditions defined as 2×SSC at 65° C.; and
(g) cytochrome P450-encoding DNA molecules capable of hybridizing with the DNA molecule defined in (e) or (f) under reduced stringency hybridization conditions defined as 6×SSC at 55° C.
4. A DNA molecule comprising a human Hct-1 gene-associated sequence and selected from the following:
(e) DNA molecules comprising one or more sequences selected from (i) the sequence designated “intron 2” in SEQ Id No 3, (ii) the sequence designated “exon 3” in SEQ Id No 3, (iii) the sequence designated “intron 3” in SEQ Id No 3, (iv) the sequence designated “exon 4” in SEQ Id No 3, and (v) the sequence designated “intron 5” in SEQ Id No 3;
(f) DNA molecules capable of hybridizing with the DNA molecules defined in (e) under standard hybridization conditions defined as 2×SSC at 65° C.; and
(g) cytochrome P450-encoding DNA molecules capable of hybridizing with the DNA molecule defined in (e) or (f) under reduced stringency hybridization conditions defined as 6×SSC at 55° C.
5. A DNA molecule comprising a human Hct-1 gene-associated sequence and selected from the following:
(h) DNA molecules comprising contiguous pairs of sequences selected from (i) the sequence designated “intron 2” in SEQ Id No 3, (ii) the sequence designated “exon 3” in SEQ Id No 3, (iii) the sequence designated “intron 3” in SEQ Id No 3, (iv) the sequence designated “exon 4” in SEQ Id No 3, and (v) the sequence designated “intron 5” in SEQ Id No 3;
(i) DNA molecules capable of hybridizing with the DNA molecules defined in (h) under standard hybridization conditions defined as 2×SSC at 65° C.; and
(j) cytochrome P450-encoding DNA molecules capable of hybridizing with the DNA molecule defined in (h) or (i) under reduced stringency hybridization conditions defined as 6×SSC at 55° C.
6. A DNA molecule comprising a human Hct-1 gene-associated coding sequence selected from the following:
(k) DNA molecules comprising a contiguous coding sequence consisting of the sequences “exon 3” and “exon 4” in SEQ Id No 3;
(l) DNA molecules capable of hybridizing with the DNA molecules defined in (k) under standard hybridization conditions defined as 2×SSC at 65° C.; and
(m) cytochrome P450-encoding DNA molecules capable of hybridizing with the DNA molecule defined in (k) or (l) under reduced stringency hybridization conditions defined as 6×SSC at 55° C.
7. A DNA molecule encoding an Hct-1 gene-associated coding sequence coded for by a DNA molecule as claimed in any of claims 1 to 6, but which differs in sequence from the sequences of the DNA molecules claimed in claims 1 to 6 by virtue of one or more amino acids of said Hct-1 gene-associated sequences being encoded by degenerate codons.
8. A DNA molecule consisting of a contiguous sequence of at least 18 nucleotides from the DNA sequence set forth in SEQ Id Nos: 1, 2 and 3.
9. A DNA sequence according to claim 8 containing at least 24 and most preferably at least 30 nucleotides taken from said sequence.
10. The use of a DNA molecule according to claim 8 or claim 9 as a hybridization probe for isolating or detecting members of gene families and homologous DNA sequences related to the Hct-1 gene, especially a human gene sequence.
11. The use of a DNA molecule according to claim 8 or claim 9 in the diagnosis of neuropsychiatric disorders, endocrine disorders, immunological disorders, diseases of cognitive function or neurodegenerative diseases.
12. The use of a short (e.g. 10 to 25) oligonucleotide primer, capable of hybridising with a DNA molecule claimed in any of claims 1 to 9 in the polymerase chain reaction (PCR) amplification of a genomic or cDNA from a biological sample for the purpose of diagnosis of neuropsychiatric disorders, endocrine disorders, immunological disorders, diseases of cognitive function or neurodegenerative diseases.
13. A cytochrome P450 protein, at least a portion of which is encoded by a DNA molecule as claimed in any of claims 1 to 7.
14. A protein selected from the following:
(i) the protein designated rat Hct-1 comprising the amino acid sequence set forth in SEQ Id No: 1 or a protein having substantial homology thereto,
(ii) the protein designated mouse Hct-1 comprising the amino acid sequence set forth in SEQ Id No: 2 or a protein having substantial homology thereto, or
(iii) the protein designated human Hct-1 comprising the amino acid sequence set forth in SEQ Id No: 3 or a protein having substantial homology thereto.
15. A protein according to claim 14 having a degree of homology such that at least 50%, preferably at least 60% and most preferably at least 70% of the amino acids match said SEQ Id No: 1, 2 or 3.
16. A process for producing a Hct-1 polypeptide, which comprises culturing a transformed host and recovering the desired Hct-1 polypeptide, characterised in that the host is transformed with nucleic acid comprising a coding sequence as defined in any of claims 1 to 7.
17. A process according to claim 15 wherein the transformed host cell is a yeast, bacterial, insect or mammalian cell.
18. A process according to claim 16 or claim 17 wherein the nucleic acid comprises an expression construct or an expression vector.
19. A process according to claim 18 wherein the vector is a vaccinia virus or baculovirus vector, a yeast plasmid or integration vector.
20. An antibody, especially a monoclonal antibody, which binds to a Hct-1 protein.
21. The use of an antibody according to claim 19 in the diagnosis of neuropsychiatric disorders, endocrine disorders, immunological disorders, diseases of cognitive function or neurodegenerative diseases.
22. The use of a protein according to any of claims 13 to 15, or an antibody according to claim 18 in the design and/or manufacture of an antagonist to Hct-1 protein.
23. The use of a protein according to any of claims 13 to 15, to effect a catalytic transformation of a substrate.
24. The use according to claim 23 wherein the substrate is a steroid.
25. A transformed substrate when produced as a result of the use claimed in claim 23 or claim 24.
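Two notions used in the claims above lend themselves to a short worked illustration: DNA sequences that differ only by degenerate codons (claim 7) and the percentage of matching amino acids (claim 15). The sketch below is illustrative only. The codon table is deliberately partial, the `VARIANT` sequence is a hypothetical example constructed for this sketch, and `percent_match` assumes two equal-length, already-aligned fragments with no gaps, which is a simplification of any real homology comparison. The listed fragments are copied from the rat and mouse entries of the sequence listing.

```python
from typing import List, Sequence

# Partial standard genetic code: only the codons needed for this illustration.
CODON_TABLE = {
    "GCC": "Ala", "GCT": "Ala",
    "TTG": "Leu", "CTG": "Leu",
    "GAG": "Glu", "GAA": "Glu",
    "TAC": "Tyr", "TAT": "Tyr",
    "CAG": "Gln", "CAA": "Gln",
}


def translate(dna: str) -> List[str]:
    """Translate a DNA string (spaces ignored) codon by codon."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def percent_match(a: Sequence[str], b: Sequence[str]) -> float:
    """Percentage of identical positions in two equal-length, ungapped fragments."""
    if len(a) != len(b):
        raise ValueError("fragments must have the same length")
    identical = sum(x == y for x, y in zip(a, b))
    return 100.0 * identical / len(a)


# First five codons of the rat coding sequence as given in the sequence listing.
LISTED = "GCC TTG GAG TAC CAG"
# Hypothetical variant built from synonymous (degenerate) codons.
VARIANT = "GCT CTG GAA TAT CAA"

# Corresponding ten-residue fragments of the rat and mouse proteins in the listing.
RAT = "Tyr Gln Tyr Val Met Lys Asn Pro Lys Gln".split()
MOUSE = "Tyr Gln Tyr Val Thr Lys Asn Pro Lys Gln".split()

if __name__ == "__main__":
    print(translate(LISTED))
    print(translate(VARIANT))
    print(percent_match(RAT, MOUSE))
```

The two DNA strings differ at every codon yet yield the same five residues, which is the situation claim 7 addresses. A full-length comparison against the thresholds of claim 15 would first require a proper sequence alignment, which this sketch does not attempt.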